CAULIFLOWER FLORAL MERISTEM IDENTITY GENES
AND METHODS OF USING SAME

This work was supported by grant DCB-9018749
awarded by the National Science Foundation. The United
States Government has certain rights in this invention.

BACKGROUND OF THE INVENTION

FIELD OF THE INVENTION 

This invention relates generally to the field
of plant flowering and more specifically to genes
involved in the regulation of flowering.

BACKGROUND INFORMATION 

A flower is the reproductive structure of a
flowering plant. Following fertilization, the ovary of
the flower becomes a fruit and bears seeds. As a
practical consequence, production of fruit and
seed-derived crops such as grapes, beans, corn, wheat and
rice is dependent upon flowering.

Early in the plant life cycle, vegetative
growth occurs, and roots, stems and leaves are formed.
During the later period of reproductive growth, flowers
as well as new shoots or branches develop. However, the
factors responsible for the transition from vegetative to
reproductive growth, and the onset of flowering, are
poorly understood.

A variety of external signals, such as length
of daylight and temperature, affect the time of
flowering. The time of flowering also is subject to
genetic controls that prevent young plants from flowering
prematurely. Thus, the pattern of genes expressed in a
plant is an important determinant of the time of
flowering.

Given these external signals and genetic
controls, a relatively fixed period of vegetative growth
precedes flowering in a particular plant species. The
length of time required for a crop to mature to flowering
limits the geographic location in which it can be grown
and can be an important determinant of yield. In
addition, since the time of flowering determines when a
plant is reproductively mature, the pace of a plant
breeding program also depends upon the length of time
required for a plant to flower.

Traditionally, plant breeding involves
generating hybrids of existing plants, which are examined
for improved yield or quality. The improvement of
existing plant crops through plant breeding is central to
increasing the amount of food grown in the world since
the amount of land suitable for agriculture is limited.
For example, the development of new strains of wheat,
corn and rice through plant breeding has increased the
yield of these crops grown in underdeveloped countries
such as Mexico, India and Pakistan. Unfortunately, plant
breeding is inherently a slow process since plants must
be reproductively mature before selective breeding can
proceed.

For some plant species, the length of time
needed to mature to flowering is so long that selective
breeding, which requires several rounds of backcrossing
progeny plants with their parents, is impractical. For
example, perennial trees such as walnut, hickory, oak,
maple and cherry do not flower for several years after
planting. As a result, breeding of such plant species
for insect or disease resistance or to produce improved
wood or fruit, for example, would require many years,
even if only a few rounds of selection were performed.



Methods of promoting early flowering can make
breeding of long-generation plants such as trees
practical for the first time. Methods of promoting early
flowering also would be useful for shortening growth
periods, thereby broadening the geographic range in which
a crop such as rice, corn or coffee can be grown.
Unfortunately, methods for promoting early flowering in a
plant have not yet been described. Thus, there is a need
for methods that promote early flowering. The present
invention satisfies this need and provides related
advantages as well.

SUMMARY OF THE INVENTION

The present invention provides a nucleic acid
molecule encoding a CAULIFLOWER (CAL) gene product. For
example, the invention provides a nucleic acid molecule
encoding Arabidopsis thaliana CAL and a nucleic acid
molecule encoding Brassica oleracea CAL.

The invention also provides a nucleic acid
molecule encoding a truncated CAL gene product. For
example, the invention provides a nucleic acid molecule
encoding the truncated Brassica oleracea var. botrytis
CAL gene product. The invention also provides a
nucleotide sequence that hybridizes under relatively
stringent conditions to a nucleic acid molecule encoding
a CAL gene product, a truncated CAL gene product, or a
complementary sequence thereto.

The invention further provides the Arabidopsis
thaliana CAL gene, Brassica oleracea CAL gene and
Brassica oleracea var. botrytis CAL gene. In addition,
the invention provides a nucleotide sequence that
hybridizes under relatively stringent conditions to the
Arabidopsis thaliana CAL gene, Brassica oleracea CAL gene
or Brassica oleracea var. botrytis CAL gene, or a
complementary sequence thereto.

The invention also provides vectors, including
expression vectors, containing a nucleic acid molecule
encoding a CAL gene product. The invention further
provides a kit for converting shoot meristem to floral
meristem in an angiosperm and a kit for promoting early
flowering in an angiosperm.

In addition, the invention provides a CAL
polypeptide, such as the Arabidopsis thaliana CAL
polypeptide or the Brassica oleracea CAL polypeptide, as
well as an antibody that specifically binds a CAL
polypeptide. The invention further provides the
truncated Brassica oleracea var. botrytis CAL polypeptide
and an antibody that specifically binds the truncated
Brassica oleracea var. botrytis CAL polypeptide.



The invention further provides a method of
identifying a Brassica having a modified CAL allele by
detecting a polymorphism associated with a CAL locus,
where the CAL locus comprises a modified CAL allele that
does not encode an active CAL gene product. For example,
the polymorphism can be a restriction fragment length
polymorphism and the modified CAL allele can be the
Brassica oleracea var. botrytis CAL allele.



BRIEF DESCRIPTION OF THE DRAWINGS



Figure 1 illustrates the nucleotide (SEQ ID
NO: 1) and amino acid (SEQ ID NO: 2) sequence of the
Arabidopsis thaliana AP1 cDNA.

Figure 2 illustrates the nucleotide (SEQ ID
NO: 3) and amino acid (SEQ ID NO: 4) sequence of the
Brassica oleracea AP1 cDNA.

Figure 3 illustrates the nucleotide (SEQ ID
NO: 5) and amino acid (SEQ ID NO: 6) sequence of the
Brassica oleracea var. botrytis AP1 cDNA.

Figure 4 illustrates the nucleotide (SEQ ID
NO: 7) and amino acid (SEQ ID NO: 8) sequence of the Zea
mays AP1 cDNA. The GenBank accession number is L46400.

Figure 5 illustrates the nucleotide (SEQ ID
NO: 9) and amino acid (SEQ ID NO: 10) sequence of the
Arabidopsis thaliana CAL cDNA.

Figure 6 illustrates the nucleotide (SEQ ID
NO: 11) and amino acid (SEQ ID NO: 12) sequence of the
Brassica oleracea CAL cDNA.

Figure 7 illustrates the nucleotide (SEQ ID
NO: 13) and amino acid (SEQ ID NO: 14) sequence of the
Brassica oleracea var. botrytis CAL cDNA.

Figure 8 illustrates CAL gene structure and
provides a comparison of various CAL amino acid
sequences.

Figure 8A. Exon-intron structure of the
Arabidopsis CAL gene. Exons are shown as boxes and
introns as a solid line. Sizes (in base pairs) are
indicated above. Locations of changes resulting in
mutant alleles are indicated by arrows. MADS and K
domains are hatched.

Figure 8B. An alignment of three deduced amino
acid sequences of CAL cDNAs. The complete Arabidopsis
thaliana CAL amino acid sequence is displayed. The
Brassica oleracea CAL (BoCAL) and Brassica oleracea var.
botrytis CAL (BobCAL) amino acid sequences are shown
directly below the Arabidopsis sequence where the
sequences differ. The AP1 amino acid sequence is shown
for comparison. The MADS domain is indicated in bold and
the K domain is underlined. GenBank accession numbers
are as follows: Arabidopsis thaliana CAL (L36925);
Brassica oleracea CAL (L36926) and Brassica oleracea var.
botrytis CAL (L36927).

Figure 9 illustrates the nucleotide (SEQ ID
NO: 15) and amino acid (SEQ ID NO: 16) sequence of the
Arabidopsis thaliana LEAFY (LFY) cDNA.

Figure 10 illustrates the genomic sequence of
Arabidopsis thaliana AP1 (SEQ ID NO: 17).

Figure 11 illustrates the genomic sequence of
Brassica oleracea AP1 (SEQ ID NO: 18).

Figure 12 illustrates the genomic sequence of
Brassica oleracea var. botrytis AP1 (SEQ ID NO: 19).

Figure 13 illustrates the genomic sequence of
Arabidopsis thaliana CAL (SEQ ID NO: 20).

Figure 14 illustrates the genomic sequence of
Brassica oleracea CAL (SEQ ID NO: 21).

Figure 15 illustrates the genomic sequence of
Brassica oleracea var. botrytis CAL (SEQ ID NO: 22).

Figure 16 illustrates the nucleotide (SEQ ID
NO: 23) and amino acid (SEQ ID NO: 24) sequence of the
rat glucocorticoid receptor ligand binding domain.

DETAILED DESCRIPTION OF THE INVENTION

The present invention provides a nucleic acid
molecule encoding a CAULIFLOWER (CAL) gene product, which
is a floral meristem identity gene product involved in
the conversion of shoot meristem to floral meristem. For
example, the invention provides a nucleic acid molecule
encoding Arabidopsis thaliana CAL and a nucleic acid
molecule encoding Brassica oleracea CAL (BoCAL) (Kempin
et al., Science 267:522-525 (1995), which is
incorporated herein by reference). As disclosed herein,
a CAL gene product can be expressed in an angiosperm,
thereby converting shoot meristem to floral meristem in
the angiosperm or promoting early flowering in the
angiosperm. The invention also provides a nucleic acid
molecule encoding a truncated CAL gene product such as a
nucleic acid molecule encoding Brassica oleracea var.
botrytis CAL (BobCAL). The invention also provides a
nucleic acid molecule containing the Arabidopsis thaliana
CAL gene, a nucleic acid molecule containing the Brassica
oleracea CAL gene and a nucleic acid molecule containing
the Brassica oleracea var. botrytis CAL gene. The
invention further provides a kit for converting shoot
meristem to floral meristem and a kit for promoting early
flowering in an angiosperm. The invention provides a CAL
polypeptide and an antibody that specifically binds a CAL
polypeptide. In addition, the invention provides the
truncated BobCAL polypeptide and an antibody that
specifically binds the truncated BobCAL polypeptide. The
invention further provides a method of identifying a
Brassica having a modified CAL allele by detecting a
polymorphism associated with a CAL locus, where the CAL
locus comprises a modified CAL allele that does not
encode an active CAL gene product.

The present invention provides a non-naturally
occurring angiosperm containing a first ectopically
expressible nucleic acid molecule encoding a first floral
meristem identity gene product. For example, the
invention provides a transgenic angiosperm containing a
first ectopically expressible floral meristem identity
gene product such as APETALA1 (AP1), CAULIFLOWER (CAL) or
LEAFY (LFY). Such a transgenic angiosperm can be, for
example, a cereal plant, leguminous plant, oilseed plant,
tree, fruit-bearing plant or ornamental flower.



A flower, like a leaf or shoot, is derived from
the shoot apical meristem, which is a collection of
undifferentiated cells set aside during embryogenesis.
The production of vegetative structures, such as leaves
or shoots, and of reproductive structures, such as
flowers, is temporally segregated, such that a leaf or
shoot arises early in a plant life cycle, while a flower
develops later. The transition from vegetative to
reproductive development is the consequence of a process
termed floral induction (Yanofsky, Ann. Rev. Plant
Physiol. Plant Mol. Biol. 46:167-188 (1995)).

Once induced, shoot apical meristem either
persists and produces floral meristem, which gives rise
to flowers, and lateral meristem, which gives rise to
branches, or is itself converted to floral meristem. The
fate of floral meristem is to differentiate into a single
flower having a fixed number of floral organs in a
whorled arrangement. Dicots, for example, contain four
whorls (concentric rings) in which sepals (first whorl)
and petals (second whorl) surround stamens (third whorl)
and carpels (fourth whorl).

Although shoot meristem and floral meristem
both consist of meristematic tissue, shoot meristem is
distinguishable from the more specialized floral
meristem. Shoot meristem generally is indeterminate and
gives rise to an unspecified number of floral and lateral
meristems. In contrast, floral meristem is determinate
and gives rise to the fixed number of floral organs that
comprise a flower.

By convention herein, a wild-type gene sequence
is represented in upper case italic letters (for example,
APETALA1), and a wild-type gene product is represented in
upper case non-italic letters (APETALA1). Further, a
mutant gene allele is represented in lower case italic
letters (ap1), and a mutant gene product is represented
in lower case non-italic letters (ap1).

Genetic studies have identified a number of
genes involved in regulating flower development. These
genes can be classified into different groups depending
on their function. Flowering time genes, for example,
are involved in floral induction and regulate the
transition from vegetative to reproductive growth. In
comparison, the floral meristem identity genes, which are
the subject matter of the present invention as disclosed
herein, encode proteins that promote the conversion of
shoot meristem to floral meristem. In addition, floral
organ identity genes encode proteins that determine
whether sepals, petals, stamens or carpels are formed
(Yanofsky, supra, 1995; Weigel, Ann. Rev. Genetics
29:19-39 (1995)). Some of the floral meristem identity
gene products also have a role in specifying organ
identity.

Floral meristem identity genes have been
identified by characterizing genetic mutations that
prevent or alter floral meristem formation. Among floral
meristem identity gene mutations in Arabidopsis thaliana,
those in the gene LEAFY (LFY) generally have the
strongest effect on floral meristem identity. Mutations
in LFY completely transform the basal-most flowers into
secondary shoots and have variable effects on
later-arising (apical) flowers. In comparison, mutations
in the floral meristem identity gene APETALA1 (AP1)
result in replacement of a few basal flowers by
inflorescence shoots that are not subtended by leaves.
An apical flower produced in an ap1 mutant has an
indeterminate structure in which a flower arises within a
flower. These mutant phenotypes indicate that both AP1
and LFY contribute to establishing the identity of the
floral meristem although neither gene is absolutely
required. The phenotype of lfy ap1 double mutants, in
which structures with flower-like characteristics are
very rare, indicates that LFY and AP1 encode partially
redundant activities.

In addition to the LFY and AP1 genes, a third
locus that greatly enhances the ap1 mutant phenotype has
been identified in Arabidopsis. This locus, designated
CAULIFLOWER (CAL), derives its name from the resulting
"cauliflower" phenotype, which is strikingly similar to
the common garden variety of cauliflower. In an ap1 cal
double mutant, floral meristem that develops behaves as
shoot meristem in that there is a massive proliferation
of meristems in the position that normally would be
occupied by a single flower. However, a plant homozygous
for a particular cal mutation (cal-1) has a normal
phenotype, indicating that AP1 can substitute for the
loss of CAL in these plants. In addition, because floral
meristem that forms in an ap1 mutant behaves as shoot
meristem in an ap1 cal double mutant, CAL can largely
substitute for AP1 in specifying floral meristem. These
genetic data indicate that CAL and AP1 encode activities
that are partially redundant in converting shoot meristem
to floral meristem.



Other genetic loci play at least minor roles in
specifying floral meristem identity. For example,
although a mutation in APETALA2 (AP2) alone does not
result in altered inflorescence characteristics, ap2 ap1
double mutants have indeterminate flowers (flowers with
shoot-like characteristics) (Bowman et al., Development
119:721-743 (1993)). Also, mutations in the CLAVATA1
(CLV1) gene result in an enlarged meristem and lead to a
variety of phenotypes (Clark et al., Development
119:397-418 (1993)). In a clv1 ap1 double mutant,
formation of flowers is initiated, but the center of each
flower often develops as an indeterminate inflorescence.
Thus, mutations in CLAVATA1 result in the loss of floral
meristem identity in the center of wild-type flowers.
Genetic evidence also indicates that the gene product of
UNUSUAL FLORAL ORGANS (UFO) plays a role in determining
the identity of floral meristem. Additional floral
meristem identity genes associated with altered floral
meristem formation remain to be isolated.



Mutations in another locus, designated TERMINAL
FLOWER (TFL), produce phenotypes that generally are
reversed as compared to mutations in the floral meristem
identity genes. For example, tfl mutants flower early,
and the indeterminate apical and lateral meristems
develop as determinate floral meristems (Alvarez et al.,
Plant J. 2:103-116 (1992)). These characteristics
indicate that TFL promotes maintenance of shoot
meristem. TFL also acts directly or indirectly to
negatively regulate AP1 and LFY expression in shoot
meristem since AP1 and LFY are ectopically expressed in
the shoot meristem of tfl mutants (Gustafson-Brown et
al., Cell 76:131-143 (1994); Weigel et al., Cell
69:843-859 (1992)). It is recognized that a plant having
a mutation in TFL can have a phenotype similar to a
non-naturally occurring angiosperm of the invention.
Such tfl mutants, however, are explicitly excluded from
the scope of the present invention.

The results of such genetic studies indicate
that several floral meristem identity gene products,
including AP1, CAL and LFY, act redundantly to convert
shoot meristem to floral meristem and that TFL acts
directly or indirectly to negatively regulate expression
of the floral meristem identity genes. As disclosed
herein, ectopic expression of a single floral meristem
identity gene product such as AP1, CAL or LFY is
sufficient to convert shoot meristem to floral meristem.
Thus, the present invention provides a non-naturally
occurring angiosperm that contains an ectopically
expressible nucleic acid molecule encoding a floral
meristem identity gene product, provided that such
ectopic expression is not due to a mutation in an
endogenous TERMINAL FLOWER gene.

As disclosed herein, an ectopically expressible
nucleic acid molecule encoding a floral meristem identity
gene product can be, for example, a transgene encoding a
floral meristem identity gene product under control of a
heterologous gene regulatory element. In addition, such
an ectopically expressible nucleic acid molecule can be
an endogenous floral meristem identity gene coding
sequence that is placed under control of a heterologous
gene regulatory element. The ectopically expressible
nucleic acid molecule also can be, for example, an
endogenous floral meristem identity gene having a
modified gene regulatory element such that the endogenous
floral meristem identity gene is no longer subject to
negative regulation by TFL.

The term "ectopically expressible" is used
herein to refer to a gene transcript or gene product that
can be expressed in a tissue other than a tissue in which
it normally is produced. The actual ectopic expression
thereof is dependent on various factors and can be
constitutive or inducible expression. As disclosed
herein, AP1, which normally is expressed in floral
meristem, is ectopically expressible in shoot meristem.
As disclosed herein, when a floral meristem identity gene
product such as AP1, CAL or LFY is ectopically expressed
in shoot meristem, the shoot meristem is converted to
floral meristem and early flowering can occur (see
Examples II, IV and V).

In particular, an ectopically expressible
nucleic acid molecule encoding a floral meristem identity
gene product can be expressed prior to the developmental
time at which the corresponding endogenous gene normally
is expressed. For example, an Arabidopsis plant grown
under continuous light conditions expresses AP1 just
prior to day 18, when normal flowering begins. However,
as disclosed herein, AP1 can be ectopically expressed in
shoot meristem earlier than day 18, resulting in early
conversion of shoot meristem to floral meristem and early
flowering. As shown in Example IID, a transgenic
Arabidopsis plant that ectopically expresses AP1 in shoot
meristem under control of a constitutive promoter flowers
earlier than the corresponding non-transgenic plant (day
10 as compared to day 18).

As used herein, the term "floral meristem
identity gene product" means a gene product that promotes
conversion of shoot meristem to floral meristem. As
disclosed herein, expression of a floral meristem
identity gene product such as AP1, CAL or LFY in shoot
meristem can convert shoot meristem to floral meristem.
Furthermore, expression of a floral meristem identity
gene product in shoot meristem also can promote early
flowering (Examples IID, IVA and V). A floral meristem
identity gene product is distinguishable from a late
flowering gene product or an early flowering gene
product, which are not encompassed within the present
invention. In addition, reference is made herein to an
"inactive" floral meristem identity gene product, as
exemplified by BobCAL (see below). Expression of an
inactive floral meristem identity gene product in an
angiosperm does not result in the conversion of shoot
meristem to floral meristem in the angiosperm.

A floral meristem identity gene product can be,
for example, an AP1 gene product such as Arabidopsis AP1,
which is a 256 amino acid gene product encoded by the AP1
cDNA sequence isolated from Arabidopsis thaliana
(Figure 1, SEQ ID NO: 2). The Arabidopsis AP1 cDNA
encodes a highly conserved MADS domain, which can
function as a DNA-binding domain, and a K domain, which
is structurally similar to the coiled-coil domain of
keratins and can be involved in protein-protein
interactions.

In Arabidopsis, AP1 RNA is expressed in flowers
but is not detectable in roots, stems or leaves (Mandel
et al., Nature 360:273-277 (1992), which is incorporated
herein by reference). The earliest detectable expression
of AP1 RNA is in young floral meristem at the time it
initially forms on the flanks of shoot meristem.
Expression of AP1 increases as the floral meristem
increases in size; no AP1 expression is detectable in
shoot meristem. In later stages of development, AP1
expression ceases in cells that will give rise to
reproductive organs (stamens and carpels), but is
maintained in cells that will give rise to
non-reproductive organs (sepals and petals; Mandel,
supra, 1992).

As used herein, the term "APETALA1" or "AP1"
means a floral meristem identity gene product that is
characterized, in part, by having an amino acid sequence
that is related to the Arabidopsis AP1 amino acid
sequence shown in Figure 1 (SEQ ID NO: 2) or to the Zea
mays AP1 amino acid sequence shown in Figure 4 (SEQ ID
NO: 8). In nature, AP1 is expressed in floral meristem.

CAULIFLOWER (CAL) is another example of a
floral meristem identity gene product. As used herein,
the term "CAULIFLOWER" or "CAL" means a floral meristem
identity gene product that is characterized in part by
having an amino acid sequence that has at least about 70
percent identity with the amino acid sequence shown in
Figure 5 (SEQ ID NO: 10) in the region from amino acid 1
to amino acid 160 or with the amino acid sequence shown
in Figure 6 (SEQ ID NO: 12) in the region from amino acid
1 to amino acid 160. In nature, CAL is expressed in
floral meristem.
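
By way of illustration only, the percent identity criterion
in the foregoing definition can be sketched as a simple
calculation over a defined region of two sequences. The
sketch below assumes that the two sequences already have
been aligned to equal length (with "-" marking gaps), that
gap positions count toward the region length but never as
matches, and that "at least about 70 percent" is treated as
a fixed cutoff. The function names and the toy sequences
are hypothetical and are not SEQ ID NO: 10 or 12.

```python
def percent_identity(seq_a, seq_b, start=1, end=160):
    """Percent identity over 1-based, inclusive positions start..end.

    Assumes seq_a and seq_b are pre-aligned to equal length, with '-'
    marking gaps. Gap columns count in the denominator but never as
    matches (one of several possible conventions).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be pre-aligned to equal length")
    region = list(zip(seq_a[start - 1:end], seq_b[start - 1:end]))
    if not region:
        raise ValueError("region is empty")
    matches = sum(1 for a, b in region if a == b and a != "-")
    return 100.0 * matches / len(region)


def meets_identity_threshold(candidate, reference, threshold=70.0):
    """True if identity over residues 1-160 is at least `threshold` percent."""
    return percent_identity(candidate, reference) >= threshold


# Toy 10-residue strings (hypothetical, for illustration only).
toy_reference = "MGRGKVELKR"
toy_candidate = "MGRGKIELRR"
toy_identity = percent_identity(toy_candidate, toy_reference, start=1, end=10)
```

Other conventions, such as excluding gap columns from the
denominator, would give somewhat different values for the
same pair of sequences.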

The present invention provides a nucleic acid
molecule encoding a CAL, including, for example, the
Arabidopsis CAL cDNA sequence shown in Figure 5 (SEQ ID
NO: 9). As disclosed herein, CAL, like AP1, contains a
MADS domain and a K domain. The MADS domains of CAL and
AP1 differ in only five of 56 amino acid residues, where
four of the five differences represent conservative amino
acid replacements. Over the entire sequence, the
Arabidopsis CAL and Arabidopsis AP1 sequences (SEQ ID
NOS: 10 and 2) are 76% identical and are 88% similar if
conservative amino acid substitutions are allowed.
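
The distinction drawn above between identity and similarity
can be sketched as follows. The MADS domain figures follow
directly from the counts stated above (51 of 56 residues
identical; 55 of 56 identical or conservatively replaced).
The grouping of residues treated as conservative
replacements is an assumption made only for illustration,
since the particular replacements scored as conservative
are not specified here, and the function names and toy
sequences are hypothetical.

```python
# Illustrative grouping of chemically similar residues (an assumption;
# the source does not say which replacements it treats as conservative).
CONSERVATIVE_GROUPS = [
    set("ILVM"), set("FWY"), set("KRH"), set("DE"), set("ST"), set("NQ"),
]


def is_conservative(a, b):
    """True if residues a and b fall in the same illustrative group."""
    return any(a in group and b in group for group in CONSERVATIVE_GROUPS)


def identity_and_similarity(seq_a, seq_b):
    """Return (percent identity, percent similarity) for aligned sequences."""
    if len(seq_a) != len(seq_b) or not seq_a:
        raise ValueError("sequences must be non-empty and of equal length")
    pairs = list(zip(seq_a, seq_b))
    identical = sum(1 for a, b in pairs if a == b)
    conservative = sum(1 for a, b in pairs if a != b and is_conservative(a, b))
    n = len(pairs)
    return 100.0 * identical / n, 100.0 * (identical + conservative) / n


# Arithmetic from the counts stated in the text for the MADS domain:
# five of 56 residues differ, four of those five conservatively.
mads_identity = 100.0 * (56 - 5) / 56
mads_similarity = 100.0 * (56 - 1) / 56

# Toy 5-residue strings (hypothetical, for illustration only).
toy_scores = identity_and_similarity("ILKDS", "IVKEA")
```

Under these counts the MADS domains are about 91% identical
and about 98% similar, which is higher than the 76% and 88%
values reported over the entire sequence.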

Similar to the expression pattern of AP1, CAL
RNA is expressed in young floral meristem in Arabidopsis.
However, in contrast to AP1 expression, which is high
throughout sepal and petal development, CAL expression is
low in these organs.

LEAFY (LFY) is yet another example of a floral
meristem identity gene product. As used herein, the term
"LEAFY" or "LFY" means a floral meristem identity gene
product that is characterized in part by having an amino
acid sequence that is related to the amino acid sequence
shown in Figure 9 (SEQ ID NO: 16). In nature, LFY is
expressed in floral meristem as well as during vegetative
development. As disclosed herein, ectopic expression of
floral meristem identity gene products, which normally
are expressed in floral meristem, such as AP1 or CAL or
LFY or combinations thereof, in shoot meristem can
convert shoot meristem to floral meristem and promote
early flowering.

Flower development in Arabidopsis is recognized
in the art as a model for flower development in
angiosperms in general. Gene orthologs corresponding to
the Arabidopsis genes involved in the early steps of
flower formation have been identified in distantly
related plant species, and these gene orthologs show
remarkably similar RNA expression patterns. Mutations in
these genes also result in phenotypes that correspond to
the phenotype produced by a similar mutation in
Arabidopsis. For example, orthologs of the Arabidopsis
floral meristem identity genes AP1 and LFY and the
Arabidopsis organ identity genes AGAMOUS, APETALA3 and
PISTILLATA have been isolated from monocots such as maize
and, where characterized, reveal the anticipated RNA
expression patterns and related mutant phenotypes
(Schmidt et al., Plant Cell 5:729-737 (1993); and Veit et
al., Plant Cell 5:1205-1215 (1993), each of which is



WO 97/27287 



PCT/US96/01041 



incorporated herein by reference). Furthermore, a gene
ortholog can be functionally interchangeable in that it
can function across distantly related species boundaries
(Mandel et al., Cell 71:133-143 (1992), which is
incorporated herein by reference). Taken together, these
data suggest that the underlying mechanisms controlling
the initiation and proper development of flowers are
conserved across distantly related dicot and monocot
boundaries. Therefore, results obtained using
Arabidopsis can be predictive of results that can be
expected in other angiosperms.

Floral meristem identity genes in particular
are conserved throughout the plant kingdom. For example,
a gene ortholog of Arabidopsis AP1 has been isolated from
Antirrhinum majus (snapdragon; Huijser et al., EMBO J.
11:1239-1249 (1992), which is herein incorporated by
reference). As disclosed herein, an ortholog of
Arabidopsis AP1 also has been isolated from Zea mays
(maize; see Example IA). Similarly, gene orthologs of
Arabidopsis LFY have been isolated from Antirrhinum
majus, tobacco and poplar tree (Coen et al., Cell
63:1311-1322 (1990); Kelly et al., Plant Cell 7:225-234
(1995); and Strauss et al., Molec. Breed. 1:5-26 (1995),
each of which is incorporated herein by reference). In
addition, a mutation in the Antirrhinum AP1 ortholog
results in a phenotype similar to the Arabidopsis ap1
mutant phenotype described above (Huijser et al., supra,
1992). Similarly, a mutation in the Antirrhinum LFY
ortholog results in a phenotype similar to the
Arabidopsis lfy mutant phenotype (Coen et al., supra,
1990). These studies indicate that AP1 and LFY function
similarly in distantly related angiosperms.



A floral meristem identity gene product also
can function across species boundaries. For example,
Arabidopsis LFY can convert shoot meristem to floral
meristem when expressed in aspen trees (Weigel and
Nilsson, Nature 377:495-500 (1995), which is incorporated
herein by reference). As disclosed herein, a nucleic
acid molecule encoding an Arabidopsis AP1 or CAL gene
product (SEQ ID NOS: 1 and 9), for example, also can be
used to convert shoot meristem to floral meristem in an
angiosperm. Thus, a nucleic acid molecule encoding an
Arabidopsis AP1 gene product (SEQ ID NO: 1) or an
Arabidopsis CAL gene product (SEQ ID NO: 9) can be
introduced into an angiosperm such as corn, wheat or rice
and, upon expression, can convert shoot meristem to
floral meristem in the transgenic angiosperm.
Furthermore, as disclosed herein, the conserved nature of
an AP1 or CAL or LFY gene among diverse angiosperms
allows a nucleic acid molecule encoding a floral meristem
identity gene product from essentially any angiosperm to
be introduced into essentially any other angiosperm,
wherein the expression of the nucleic acid molecule in
shoot meristem can convert shoot meristem to floral
meristem.



If desired, a novel AP1, CAL or LFY sequence
can be isolated from an angiosperm using a nucleotide
sequence as a probe and methods well known in the art of
molecular biology (Sambrook et al. (eds.), Molecular
Cloning: A Laboratory Manual,
Plainview, NY: Cold Spring Harbor Laboratory Press
(1989), which is herein incorporated by reference). As
exemplified herein and discussed in detail below (see
Example IA), the AP1 ortholog from Zea mays (maize; SEQ
ID NO: 7) was isolated using the Arabidopsis AP1 cDNA as
a probe (SEQ ID NO: 1).

In one embodiment, the invention provides a
non-naturally occurring angiosperm that contains an
ectopically expressible nucleic acid molecule encoding a
floral meristem identity gene product and that is
characterized by early flowering. As used herein, the
term "characterized by early flowering," when used in
reference to a non-naturally occurring angiosperm of the
invention, means a non-naturally occurring angiosperm
that forms flowers sooner than flowers would form on a
corresponding naturally occurring angiosperm that does
not ectopically express a floral meristem identity gene
product, grown under the same conditions. Flowering
times for naturally occurring angiosperms are well known
in the art and depend, in part, on genetic factors and on
the environmental conditions, such as day length. Thus,
given a defined set of environmental conditions, a
naturally occurring plant will flower at a relatively
predictable time.

It is recognized that various transgenic plants 
that are characterized by early flowering have been 
described. Such transgenic plants are described herein 
and are readily distinguishable or explicitly excluded 
from the present invention. For example, a product of a 
"late-flowering gene" can promote early flowering but 
does not specify the conversion of shoot meristem to 
floral meristem. Therefore, a transgenic plant 
expressing a late-flowering gene product is 
distinguishable from a non-naturally occurring angiosperm 
of the invention. For example, a transgenic plant 
expressing the late-flowering gene, CONSTANS (CO), 
flowers earlier than a corresponding wild type plant 
(Putterill et al., Cell 80:847-857 (1995)). However, 
expression of exogenous CONSTANS does not convert shoot 
meristem to floral meristem. 

Early flowering also has been observed in a 
transgenic tobacco plant expressing an exogenous rice 
MADS domain gene. Although the product of this gene 
promotes early flowering, it does not specify the 
identity of floral meristem and, thus, cannot convert 
shoot meristem to floral meristem (Chung et al., Plant 
Mol. Biol. 26:657-665 (1994)). Therefore, the 
early-flowering CO and rice MADS domain gene transgenic 
plants are distinguishable from the early-flowering 
non-naturally occurring angiosperms of the invention. 

Mutations in a class of genes known as 
"early-flowering genes" also result in plants that flower 
prematurely. Such early-flowering genes include, for 
example, EARLY FLOWERING 1-3 (ELF1, ELF2, ELF3); 
EMBRYONIC FLOWER 1,2 (EMF1, EMF2); LONG HYPOCOTYL 1,2 
(HY1, HY2); PHYTOCHROME B (PHYB); SPINDLY (SPY); and 
TERMINAL FLOWER (TFL) (Weigel, supra, 1995). However, 
the wild type product of an early-flowering gene retards 
flowering and is distinguishable from a floral meristem 
identity gene product in that it does not promote 
conversion of shoot meristem to floral meristem. 

An Arabidopsis plant having a mutation in the 
TERMINAL FLOWER (TFL) gene flowers early and is 
characterized by the conversion of shoots to flowers 
(Alvarez et al., Plant J. 2:103-116 (1992), which is 
incorporated herein by reference). However, TFL is not a 
floral meristem identity gene product, as defined herein. 
Specifically, it is the loss of TFL that promotes 
conversion of shoot meristem to floral meristem. Since 
the function of TFL is to antagonize formation of floral 
meristem, a tfl mutant, which has lost this antagonist 
function, permits conversion of shoot meristem to floral 
meristem. Although TFL is not a floral meristem identity 
gene product and does not itself convert shoot meristem 
to floral meristem, the loss of TFL can result in a plant 
with an ectopically expressed floral meristem identity 
gene product. Such tfl mutants, in which a mutation in 
TFL results in conversion of shoot meristem to floral 
meristem, are explicitly excluded from the present 
invention. 

As used herein, the term "non-naturally 
occurring angiosperm" means an angiosperm that contains a 
genome that has been modified by man. A transgenic 
angiosperm, for example, contains an exogenous nucleic 
acid molecule and, therefore, contains a genome that has 
been modified by man. Furthermore, an angiosperm that 
contains, for example, a mutation in an endogenous floral 
meristem identity gene regulatory element as a result of 
exposure to a mutagenic agent by man also contains a 
genome that has been modified by man. In contrast, a 
plant containing a spontaneous or naturally occurring 
mutation is not a "non-naturally occurring angiosperm" 
and, therefore, is not encompassed within the invention. 

As used herein, the term "transgenic" refers to 
an angiosperm that contains in its genome an exogenous 
nucleic acid molecule, which can be derived from the same 
or a different species. The exogenous nucleic acid 
molecule that is introduced into the angiosperm can be a 
gene regulatory element such as a promoter or other 
regulatory element or can be a coding sequence, which can 
be linked to a heterologous gene regulatory element. 

As used herein, the term "angiosperm" means a 
flowering plant. Angiosperms are well known and produce 
a variety of useful products including materials such as 
lumber, rubber, and paper; fibers such as cotton and 
linen; herbs and medicines such as quinine and 
vinblastine; ornamental flowers such as roses and 
orchids; and foodstuffs such as grains, oils, fruits and 
vegetables. 

Angiosperms are divided into two broad classes 
based on the number of cotyledons, which are seed leaves 
that generally store or absorb food. Thus, a 
monocotyledonous angiosperm is an angiosperm having a 
single cotyledon, and a dicotyledonous angiosperm is an 
angiosperm having two cotyledons. 

Angiosperms encompass a variety of flowering 
plants, including, for example, cereal plants, leguminous 
plants, oilseed plants, trees, fruit-bearing plants and 
ornamental flowers, which general classes are not 
necessarily exclusive. Such angiosperms include, for 
example, a cereal plant, which produces an edible grain. 
Such cereal plants include, for example, corn, 
rice, wheat, barley, oat, rye, orchardgrass, guinea 
grass, sorghum and turf grass. In addition, a leguminous 
plant is an angiosperm that is a member of the pea family 
(Fabaceae) and produces a characteristic fruit known as a 
legume. Leguminous plants include, for 
example, soybean, pea, chickpea, moth bean, broad bean, 
kidney bean, lima bean, lentil, cowpea, dry bean and 
peanut. Legumes also include 
alfalfa, birdsfoot trefoil, clover and sainfoin. 
Furthermore, an oilseed plant is an angiosperm that has 
seeds useful as a source of oil. Examples of oilseed 
plants include soybean, sunflower, rapeseed and 
cottonseed. 

A tree is an angiosperm and is a perennial 
woody plant, generally with a single stem (trunk). 
Examples of trees include alder, ash, aspen, basswood 
(linden), beech, birch, cherry, cottonwood, elm, 
eucalyptus, hickory, locust, maple, oak, persimmon, 
poplar, sycamore, walnut and willow. Such trees are 
used for pulp, paper, and structural material, as well as 
providing a major source of fuel. 



A fruit-bearing plant also is an angiosperm and 
produces a mature, ripened ovary (usually containing 
seeds) that is suitable for human or animal consumption. 
Examples of fruit-bearing plants include grape, orange, 
lemon, grapefruit, avocado, date, peach, cherry, olive, 
plum, coconut, apple and pear trees and blackberry, 
blueberry, raspberry, strawberry, pineapple, tomato, 
cucumber and eggplant plants. An ornamental flower is an 
angiosperm cultivated for its decorative flower. 
Examples of ornamental flowers include rose, orchid, 
lily, tulip, chrysanthemum, snapdragon, camellia, 
carnation and petunia. The skilled artisan will 
recognize that the invention can be practiced on these or 
other angiosperms, as desired. 



In various embodiments, the present invention 
provides a non-naturally occurring angiosperm having an 
ectopically expressible first nucleic acid molecule 
encoding a first floral meristem identity gene product, 
provided the first nucleic acid molecule is not 
ectopically expressed due to a mutation in an endogenous 
TFL gene. If desired, a non-naturally occurring 
angiosperm of the invention can contain an ectopically 
expressible second nucleic acid molecule encoding a 
second floral meristem identity gene product, which is 
different from the first floral meristem identity gene 
product. 

An ectopically expressible nucleic acid 
molecule can be expressed, as desired, either 
constitutively or inducibly. Such an ectopically 
expressible nucleic acid molecule can be an endogenous 
nucleic acid molecule and can contain, for example, a 
mutation in its endogenous gene regulatory element or can 
contain an exogenous, heterologous gene regulatory 
element that is linked to and directs expression of the 
endogenous nucleic acid molecule. In addition, an 
ectopically expressible nucleic acid molecule encoding a 
floral meristem identity gene product can be an exogenous 
nucleic acid molecule encoding a floral meristem identity 
gene product and containing a heterologous gene 
regulatory element. 



The invention provides, for example, a 
non-naturally occurring angiosperm containing a first 
ectopically expressible nucleic acid molecule encoding a 
first floral meristem identity gene product. If desired, 
a non-naturally occurring angiosperm of the invention can 
contain a floral meristem identity gene having a modified 
gene regulatory element and also can contain a second 
ectopically expressible nucleic acid molecule encoding a 
second floral meristem identity gene product, provided 
that neither the first nor second ectopically expressible 
nucleic acid molecule is ectopically expressed due to a 
mutation in an endogenous TERMINAL FLOWER gene. 



As used herein, the term "modified gene 
regulatory element" means a regulatory element having a 
mutation that results in ectopic expression in shoot 
meristem of the floral meristem identity gene regulated 
by the gene regulatory element. Such a gene regulatory 
element can be, for example, a promoter or enhancer 
element and can be positioned 5' or 3' to the coding 
sequence or within an intronic sequence of the floral 
meristem identity gene. Such a modification can be, for 
example, a nucleotide insertion, deletion or substitution 
and can be produced by chemical mutagenesis using a 
mutagen such as ethyl methanesulfonate (see Example IIIA) 
or by insertional mutagenesis using a transposable 
element. For example, a modified gene regulatory element 
can be a functionally inactivated binding site for TFL or 
a gene product regulated by TFL, such that modification 
of the gene regulatory element results in ectopic 
expression of the floral meristem identity gene product 
in shoot meristem. 

The invention also provides a transgenic 
angiosperm containing a first exogenous gene promoter 
that regulates a first ectopically expressible nucleic 
acid molecule encoding a first floral meristem identity 
gene product and a second exogenous gene promoter that 
regulates a second ectopically expressible nucleic acid 
molecule encoding a second floral meristem identity gene 
product. 

The invention also provides a transgenic 
angiosperm containing a first exogenous ectopically 
expressible nucleic acid molecule encoding a first floral 
meristem identity gene product and a second exogenous 
gene promoter that regulates a second ectopically 
expressible nucleic acid molecule encoding a second 
floral meristem identity gene product, provided that the 
first nucleic acid molecule is not ectopically expressed 
due to a mutation in an endogenous TERMINAL FLOWER gene. 

The invention also provides a transgenic 
angiosperm containing a first exogenous ectopically 
expressible nucleic acid molecule encoding a first floral 
meristem identity gene product and a second exogenous 
ectopically expressible nucleic acid molecule encoding a 
second floral meristem identity gene product, where the 
first floral meristem identity gene product is different 
from the second floral meristem identity gene product and 
provided that neither nucleic acid molecule is 
ectopically expressed due to a mutation in an endogenous 
TERMINAL FLOWER gene. 

The ectopic expression of first and second 
floral meristem identity gene products can be 
particularly useful. For example, ectopic expression of 
AP1 and LFY in a plant promotes flowering earlier than 
ectopic expression of AP1 alone or ectopic expression of 
LFY alone. Thus, plant breeding, for example, can be 
further accelerated, if desired. 

First and second floral meristem identity gene 
products can be, for example, AP1 and CAL, or can be AP1 
and LFY or can be CAL and LFY. It should be recognized 
that where a transgenic angiosperm of the invention 
contains two exogenous nucleic acid molecules, the order 
of introducing such a first and a second nucleic acid 
molecule is not important for purposes of the present 
invention. Thus, a transgenic angiosperm of the 
invention having, for example, AP1 as the first floral 
meristem identity gene product and CAL as the second 
floral meristem identity gene product is equivalent to a 
transgenic angiosperm having CAL as the first floral 
meristem identity gene product and AP1 as the second 
floral meristem identity gene product. 

WO 97/27287                                PCT/US96/01041 

The invention also provides methods of 
converting shoot meristem to floral meristem in an 
angiosperm by ectopically expressing an ectopically 
expressible nucleic acid molecule encoding a floral 
meristem identity gene product in the angiosperm. Thus, 
the invention provides, for example, methods of 
converting shoot meristem to floral meristem in an 
angiosperm by introducing an exogenous ectopically 
expressible nucleic acid molecule encoding a floral 
meristem identity gene product into the angiosperm, 
thereby producing a transgenic angiosperm. A floral 
meristem identity gene product such as AP1, CAL or LFY, 
or a chimeric protein containing, in part, a floral 
meristem identity gene product (see below) is useful in 
the methods of the invention. 

As used herein, the term "introducing," when 
used in reference to an angiosperm, means transferring an 
exogenous nucleic acid molecule into the angiosperm. For 
example, an exogenous nucleic acid molecule can be 
introduced into an angiosperm by methods such as 
Agrobacterium-mediated transformation or direct gene 
transfer methods including microprojectile-mediated 
transformation (Klein et al., Nature 327:70-73 (1987), 
which is incorporated herein by reference). These and 
other methods of introducing a nucleic acid molecule into 
an angiosperm are well known in the art (Bowman et al. 
(ed.), Arabidopsis: An Atlas of Morphology and 
Development, New York: Springer (1994); Valvekens et 
al., Proc. Natl. Acad. Sci. USA 85:5536-5540 (1988); and 
Wang et al., Transformation of Plants and Soil 
Microorganisms, Cambridge, UK: University Press (1995), 
each of which is incorporated herein by reference). 

As used herein, the term "converting shoot 
meristem to floral meristem" means promoting the 
formation of flower progenitor tissue where shoot 
progenitor tissue would normally be formed. As a result 
of the conversion of shoot meristem to floral meristem, 
flowers form in an angiosperm where shoots normally would 
form. The conversion of shoot meristem to floral 
meristem can be identified using well known methods, such 
as scanning electron microscopy, light microscopy or 
visual inspection. 

The invention also provides methods of 
converting shoot meristem to floral meristem in an 
angiosperm by introducing a first ectopically expressible 
nucleic acid molecule encoding a first floral meristem 
identity gene product and a second ectopically 
expressible nucleic acid molecule encoding a second 
floral meristem identity gene product into the 
angiosperm. As discussed above, first and second floral 
meristem identity gene products useful in the invention 
can be, for example, AP1 and CAL or AP1 and LFY or CAL 
and LFY. 



The invention also provides methods of 
promoting early flowering in an angiosperm by ectopically 
expressing a nucleic acid molecule encoding a floral 
meristem identity gene product in the angiosperm, 
provided that the nucleic acid molecule is not 
ectopically expressed due to a mutation in an endogenous 
TERMINAL FLOWER gene. For example, the invention 
provides methods of promoting early flowering in an 
angiosperm by introducing an ectopically expressible 
nucleic acid molecule encoding a floral meristem identity 
gene product into the angiosperm, thus producing a 
transgenic angiosperm. A floral meristem identity gene 
product such as AP1, CAL or LFY, or a chimeric protein 
containing, in part, a floral meristem identity gene 
product (see below) is useful in methods of promoting 
early flowering. 



The present invention further provides nucleic 
acid molecules encoding floral meristem identity gene 
products. For example, the invention provides a nucleic 
acid molecule encoding CAL, having at least about 70 
percent amino acid identity with amino acids 1 to 160 of 
SEQ ID NO: 10 or SEQ ID NO: 11. The invention also 
provides a nucleic acid molecule encoding Arabidopsis 
thaliana CAL having the amino acid sequence shown in 
Figure 5 (SEQ ID NO: 10) and a nucleic acid molecule 
encoding Brassica oleracea CAL having the amino acid 
sequence shown in Figure 6 (SEQ ID NO: 12). In addition, 
the invention provides a nucleic acid molecule encoding 
Brassica oleracea AP1 having the amino acid sequence 
shown in Figure 2 (SEQ ID NO: 4) and a nucleic acid 
molecule encoding Brassica oleracea var. botrytis AP1 
having the amino acid sequence shown in Figure 3 (SEQ ID 
NO: 6). The invention also provides a nucleic acid 
molecule encoding Zea mays AP1 having the amino acid 
sequence shown in Figure 4 (SEQ ID NO: 8). 

As disclosed herein, CAL is highly conserved 
among different angiosperms. For example, Arabidopsis 
CAL (SEQ ID NO: 10) and Brassica oleracea CAL (SEQ ID NO: 
12) share about 80 percent amino acid identity. In the 
region from amino acid 1 to amino acid 160, Arabidopsis 
CAL and Brassica oleracea CAL are about 89 percent 
identical at the amino acid level. Using a nucleotide 
sequence derived from a conserved region of SEQ ID NO: 9 
or SEQ ID NO: 11, a nucleic acid molecule encoding a 
novel CAL ortholog can be isolated from other 
angiosperms. Using methods such as those described by 
Purugganan et al. (Genetics 140:345-356 (1995)), one can 
readily confirm that the newly isolated molecule is a CAL 
ortholog. Thus, a nucleic acid molecule encoding CAL, 
which has at least about 70 percent amino acid identity 
with Arabidopsis CAL (SEQ ID NO: 10) or Brassica oleracea 
CAL (SEQ ID NO: 12), can be isolated and identified using 
well known methods. 
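
By way of illustration only, a percent amino acid identity 
comparison of the kind referred to above can be sketched in 
Python as follows. The sketch assumes that the two sequences 
have already been aligned (gaps written as "-") and counts 
identical residues over the columns in which neither sequence 
has a gap. The function names, the gap convention, the 
70 percent default and the toy sequences are hypothetical and 
are not taken from the sequence listing; other conventions for 
the denominator would give somewhat different values. 

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over aligned columns where neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" or b == "-":
            continue  # gap columns are not counted (an assumed convention)
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


def meets_identity_threshold(aligned_query: str, aligned_reference: str,
                             threshold: float = 70.0) -> bool:
    """True if the aligned query reaches the given percent identity."""
    return percent_identity(aligned_query, aligned_reference) >= threshold


# Toy, made-up aligned peptides (not SEQ ID NO: 10 or SEQ ID NO: 12).
toy_reference = "MGRGKVELKR"
toy_query = "MGRGRVQLKR"
print(percent_identity(toy_reference, toy_query))          # 80.0
print(meets_identity_threshold(toy_query, toy_reference))  # True
```

In practice the alignment itself would be produced by a 
standard alignment program before such a calculation is made. 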

The invention also provides a nucleic acid 
molecule encoding a truncated CAL gene product. For 
example, the invention provides a nucleic acid molecule 
encoding the Brassica oleracea var. botrytis CAL gene 
product (BobCAL). BobCAL contains 150 amino acids of the 
approximately 255 amino acids encoded by a full-length 
CAL cDNA (see Figure 7; SEQ ID NO: 14; see, also, Figure 
8B). 

The invention also provides a nucleic acid 
molecule containing the Arabidopsis thaliana AP1 gene 
(Figure 10; SEQ ID NO: 17), a nucleic acid molecule 
containing the Brassica oleracea AP1 gene (Figure 11; 
SEQ ID NO: 18) and a nucleic acid molecule containing 
the Brassica oleracea var. botrytis AP1 gene (Figure 12; 
SEQ ID NO: 19). In addition, the invention provides a 
nucleic acid molecule containing the Arabidopsis thaliana 
CAL gene (Figure 13; SEQ ID NO: 20) and a nucleic acid 
molecule containing the Brassica oleracea CAL gene 
(Figure 14; SEQ ID NO: 21). In addition, the invention 
provides a nucleic acid molecule containing the Brassica 
oleracea var. botrytis CAL gene (Figure 15; SEQ ID 
NO: 22). 

The invention further provides a nucleotide 
sequence that hybridizes under relatively stringent 
conditions to a nucleic acid molecule encoding a CAL, or 
a complementary sequence thereof. In particular, such a 
nucleotide sequence can hybridize under relatively 
stringent conditions to a nucleic acid molecule encoding 
Arabidopsis CAL (SEQ ID NO: 9) or Brassica oleracea CAL 
(SEQ ID NO: 11), or a complementary sequence thereof. 
Similarly, the present invention provides a nucleotide 
sequence that hybridizes under relatively stringent 
conditions to a nucleic acid molecule encoding Zea mays 
AP1 (SEQ ID NO: 7), or a complementary sequence thereof. 



In general, a nucleotide sequence that 
hybridizes under relatively stringent conditions to a 
nucleic acid molecule is a single-stranded nucleic acid 
sequence that can range in size from about 10 nucleotides 
to the full length of a gene or a cDNA. Such a 
nucleotide sequence can be chemically synthesized using 
routine methods or can be purchased from a commercial 
source. In addition, such nucleotide sequences can be 
obtained by enzymatic methods such as random priming 
methods, the polymerase chain reaction (PCR) or by 
standard restriction endonuclease digestion, followed by 
denaturation (Sambrook et al., supra, 1989). 

A nucleotide sequence that hybridizes under 
relatively stringent conditions to a nucleic acid 
molecule can be used, for example, as a primer for PCR 
(Innis et al. (ed.), PCR Protocols: A Guide to Methods and 
Applications, San Diego, CA: Academic Press, Inc. 
(1990)). Such a nucleotide sequence generally contains 
about 10 to about 50 nucleotides. 



A nucleotide sequence that hybridizes under 
relatively stringent conditions to a nucleic acid 
molecule also can be used to screen a cDNA or genomic 
library to obtain a related nucleotide sequence. For 
example, a cDNA library that is prepared from rice or 
wheat can be screened with a nucleotide sequence derived 
from the Zea mays AP1 sequence in order to isolate a rice 
or wheat ortholog of AP1. Generally, such a nucleotide 
sequence contains at least about 14-16 nucleotides 
depending, for example, on the hybridization conditions 
to be used. 

A nucleotide sequence derived from a nucleic 
acid molecule encoding Zea mays AP1 (SEQ ID NO: 7) also 
can be used to screen a Zea mays cDNA library to isolate 
a sequence that is related to but distinct from AP1. 
Furthermore, such a hybridizing nucleotide sequence can 
be used to analyze RNA levels or patterns of expression, 
as by northern blotting or by in situ hybridization to a 
tissue section. Such a nucleotide sequence also can be 
used in Southern blot analysis to evaluate gene structure 
and identify the presence of related gene sequences. 

One skilled in the art would select a 
particular nucleotide sequence that hybridizes under 
relatively stringent conditions to a nucleic acid 
molecule encoding a floral meristem identity gene product 
based on the application for which the sequence will be 
used. For example, in order to isolate an ortholog of 
AP1, one can choose a region of AP1 that is highly 
conserved among known AP1 sequences such as Arabidopsis 
AP1 (SEQ ID NO: 1) and Zea mays AP1 (GenBank accession 
number L46400; SEQ ID NO: 7). Similarly, in order to 
isolate an ortholog of CAL, one can choose a region of 
CAL that is highly conserved among known CAL cDNAs, such 
as Arabidopsis CAL (SEQ ID NO: 9) and Brassica CAL (SEQ 
ID NO: 11). It further would be recognized, for example, 
that the region encoding the MADS domain, which is common 
to a number of genes, can be excluded from the nucleotide 
sequence. In addition, one can use a full-length 
Arabidopsis AP1 or CAL cDNA nucleotide sequence (SEQ ID 
NO: 1 or SEQ ID NO: 9) to isolate an ortholog of AP1 or 
CAL. 
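
For purposes of illustration, the selection of a highly 
conserved probe region that lies outside a domain shared by 
many genes can be sketched in Python as follows. The sketch 
assumes two pre-aligned nucleotide sequences of equal length 
and slides a fixed-size window along the alignment, skipping 
any window that overlaps an excluded interval. The function 
name, the half-open (start, end) coordinate convention and 
the toy sequences are hypothetical; the excluded interval 
merely stands in for a shared region such as one encoding a 
MADS domain and does not reflect actual coordinates. 

```python
from typing import Optional, Tuple


def best_conserved_window(aligned_a: str, aligned_b: str, window: int,
                          excluded: Optional[Tuple[int, int]] = None
                          ) -> Optional[Tuple[int, int]]:
    """Return (start, matches) for the most conserved window.

    `excluded` is a half-open interval (start, end); windows that
    overlap it are skipped. Ties go to the leftmost window.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if window <= 0 or window > len(aligned_a):
        raise ValueError("window must be between 1 and the alignment length")
    best = None
    for start in range(len(aligned_a) - window + 1):
        end = start + window
        if excluded is not None and start < excluded[1] and end > excluded[0]:
            continue  # window overlaps the excluded region
        matches = sum(
            1 for a, b in zip(aligned_a[start:end], aligned_b[start:end])
            if a == b and a != "-"
        )
        if best is None or matches > best[1]:
            best = (start, matches)
    return best


# Toy, made-up aligned sequences; positions 0-5 play the role of a
# region that is shared by many genes and is therefore excluded.
seq_a = "AAAAAACCGTACGTTA"
seq_b = "AAAAAACTGAACGTTC"
print(best_conserved_window(seq_a, seq_b, window=4, excluded=(0, 6)))  # (10, 4)
```

A window chosen in this way is only a starting point; the 
suitability of a probe still depends on the hybridization 
conditions to be used. 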

For example, the Arabidopsis AP1 cDNA shown in 
Figure 1 (SEQ ID NO: 1) can be used as a probe to 
identify and isolate a novel AP1 ortholog. Similarly, 
the Arabidopsis CAL cDNA shown in Figure 5 (SEQ ID NO: 9) 
can be used to identify and isolate a novel CAL ortholog 
(see Examples IA and IIIC, respectively). In order to 
identify related MADS domain genes, a nucleotide sequence 
derived from the MADS domain of AP1 or CAL, for example, 
also can be useful to isolate a related gene sequence 
encoding this DNA-binding motif. 

Hybridization utilizing a nucleotide sequence 
of the invention requires that hybridization be performed 
under relatively stringent conditions such that 
non-specific hybridization is minimized. Appropriate 
hybridization conditions can be determined empirically, 
or can be estimated based, for example, on the relative 
G+C content of the probe and the number of mismatches 
between the probe and target sequence, if known. 
Hybridization conditions can be adjusted as desired by 
varying, for example, the temperature of hybridizing or 
the salt concentration (Sambrook, supra, 1989). 
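
As a rough illustration of how such an estimate can be made, 
the following Python sketch applies one commonly cited 
empirical approximation for the melting temperature (Tm) of a 
DNA hybrid, in which Tm rises with G+C content and with the 
logarithm of the sodium ion concentration and falls with 
shorter probe length and with mismatches. The particular 
constants, including the length term (published variants 
differ) and the assumption of about one degree Celsius per 
one percent mismatch, are assumptions of this sketch rather 
than values taken from the present disclosure, and the 
approximation is generally applied to longer probes rather 
than to short oligonucleotides. 

```python
import math


def gc_percent(probe: str) -> float:
    """Percentage of G and C bases in the probe."""
    probe = probe.upper()
    if not probe:
        raise ValueError("probe must not be empty")
    return 100.0 * (probe.count("G") + probe.count("C")) / len(probe)


def estimate_tm(probe: str, sodium_molar: float = 0.3,
                mismatch_percent: float = 0.0,
                length_constant: float = 600.0) -> float:
    """Approximate Tm in degrees Celsius (empirical rule of thumb).

    Tm = 81.5 + 16.6*log10([Na+]) + 0.41*(%G+C)
         - length_constant/N - mismatch_percent
    `length_constant` and the mismatch penalty are assumed values.
    """
    if sodium_molar <= 0:
        raise ValueError("sodium concentration must be positive")
    return (81.5
            + 16.6 * math.log10(sodium_molar)
            + 0.41 * gc_percent(probe)
            - length_constant / len(probe)
            - mismatch_percent)


# Toy 100-nucleotide probe with 50 percent G+C (made up for illustration).
toy_probe = "ATGC" * 25
for salt in (0.3, 0.03):
    print(f"[Na+] = {salt} M: estimated Tm = {estimate_tm(toy_probe, salt):.1f} C")
```

Lowering the salt concentration or raising the temperature 
toward the estimated Tm increases stringency, which is 
consistent with the adjustments described above. 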

The invention also provides a vector containing 
a nucleic acid molecule encoding a CAL gene product. In 
addition, the invention provides a vector containing a 
nucleic acid molecule encoding the Zea mays AP1 gene 
product. A vector can be a cloning vector or an 
expression vector and provides a means to transfer an 
exogenous nucleic acid molecule into a host cell, which 
can be a prokaryotic or eukaryotic cell. Such vectors 
are well known and include plasmids, phage vectors and 
viral vectors. Various vectors and methods for 
introducing such vectors into a cell are described, for 
example, by Sambrook et al., supra, 1989, and by Glick 
and Thompson (eds.), Methods in Plant Molecular Biology 
and Biotechnology, Boca Raton, FL: CRC Press (1993), 
which is incorporated herein by reference. 

The invention also provides an expression 
vector containing a nucleic acid molecule encoding a 
floral meristem identity gene product such as CAL, AP1 or 
LFY. Expression vectors are well known in the art and 
provide a means to transfer an exogenous nucleic acid 
molecule into a host cell and to express it. Thus, an 
expression vector contains, for example, transcription 
start and stop sites such as a TATA sequence and a poly-A 
signal sequence, as well as a translation start site such 
as a ribosome binding site and a stop codon, if not 
present in the coding sequence. 

An expression vector can contain, for example, 
a constitutive regulatory element useful for promoting 
expression of an exogenous nucleic acid molecule in a 
plant cell. The use of a constitutive regulatory element 
can be particularly advantageous because expression from 
the element is relatively independent of developmentally 
regulated or tissue-specific factors. For example, the 
cauliflower mosaic virus 35S promoter (CaMV35S) is a 
well-characterized constitutive regulatory element that 
produces a high level of expression in all plant tissues 
(Odell et al., Nature 313:810-812 (1985), which is 
incorporated herein by reference). The CaMV35S promoter 
is particularly useful because it is active in numerous 
different angiosperms (Benfey and Chua, Science 
250:959-966 (1990), which is incorporated herein by 
reference; Odell et al., supra, 1985). Other 
constitutive regulatory elements useful for expression in 
an angiosperm include, for example, the nopaline synthase 
(nos) gene promoter (An, Plant Physiol. 81:86 (1986), 
which is herein incorporated by reference). 

In addition, an expression vector of the 
invention can contain a regulated gene regulatory element 
such as a promoter or enhancer element. A particularly 
useful regulated promoter is a tissue-specific promoter 
such as the shoot meristem-specific CDC2 promoter 
(Hemerly et al., Plant Cell 5:1711-1723 (1993), which is 
incorporated herein by reference), or the AGL8 promoter, 
which is active in the apical shoot meristem immediately 
after the transition to flowering (Mandel and Yanofsky, 
Plant Cell 7:1763-1771 (1995), which is incorporated 
herein by reference). 

An expression vector of the invention also can 
contain an inducible regulatory element, which has 
conditional activity dependent upon the presence of a 
particular regulatory factor. Useful inducible 
regulatory elements include, for example, a heat-shock 
promoter (Ainley and Key, Plant Mol. Biol. 14:949 (1990), 
which is herein incorporated by reference) or a 
nitrate-inducible promoter derived from the spinach 
nitrite reductase gene (Back et al., Plant Mol. 
Biol. 17:9 (1991), which is herein incorporated by 
reference). A hormone-inducible element 
(Yamaguchi-Shinozaki et al., Plant Mol. Biol. 15:905 
(1990) and Kares et al., Plant Mol. Biol. 15:225 (1990), 
which are herein incorporated by reference) or a 
light-inducible promoter, such as that associated with 
the small subunit of RuBP carboxylase or the LHCP gene 
families (Feinbaum et al., Mol. Gen. Genet. 226:449 
(1991) and Lam and Chua, Science 248:471 (1990), which 
are herein incorporated by reference) also can be useful 
in an expression vector of the invention. A human 
glucocorticoid response element also can be used to 
achieve steroid hormone-dependent gene expression in 
plants (Schena et al., Proc. Natl. Acad. Sci. USA 
88:10421 (1991), which is herein incorporated by 
reference). 

An appropriate gene regulatory element such as 
a promoter is selected depending on the desired pattern 
or level of expression of a nucleic acid molecule linked 
thereto. For example, a constitutive promoter, which is 
active in all tissues, would be appropriate to express a 
desired gene product in all cells containing the vector. 
In addition, it can be desirable to restrict expression 
of a nucleic acid molecule to a particular tissue or 
stage of development. Developmentally regulated or 
tissue-specific expression can be useful for this purpose 
and can avoid potential undesirable side-effects that can 
accompany unregulated expression. Inducible expression 
also can be 
particularly useful to manipulate the timing of gene 
expression such that, for example, a population of 
transgenic angiosperms of the invention that contain an 
expression vector comprising a floral meristem identity 
gene linked to an inducible promoter can be induced to 
flower essentially at the same time. Such timing of 
flowering can be useful, for example, for manipulating 
the time of crop harvest. 

The invention also provides a kit containing an 
expression vector having a nucleic acid molecule encoding 
a floral meristem identity gene product. Such a kit is 
useful for converting shoot meristem to floral meristem 
in an angiosperm or for promoting early flowering in an 
angiosperm. If desired, such a kit can contain 
appropriate reagents, which can allow relatively high 
efficiency of transformation of an angiosperm with the 
vector. Furthermore, a control plasmid lacking the 
floral meristem identity gene can be included in the kit 
to determine, for example, the efficiency of 
transformation. 

The invention further provides a host cell 
containing a vector comprising a nucleic acid molecule 
encoding CAL. A host cell can be prokaryotic or 
eukaryotic and can be, for example, a bacterial cell, 
yeast cell, insect cell, Xenopus cell, mammalian cell or
plant cell. 

The invention also provides a transgenic garden 
variety cauliflower plant containing an exogenous nucleic 
acid molecule selected from the group consisting of a
nucleic acid molecule encoding a CAL gene product and a
nucleic acid molecule encoding an AP1 gene product. Such
a transgenic cauliflower plant can produce an edible 
flower in place of the typical cauliflower vegetable. 



A nucleic acid encoding CAL has been isolated
from a Brassica oleracea line that produces wild-type
flowers (BoCAL) and from the common garden variety of
cauliflower, Brassica oleracea var. botrytis (BobCAL),
which lacks flowers. The Brassica oleracea CAL cDNA (SEQ
ID NO: 10) is highly similar to the Arabidopsis CAL cDNA
(SEQ ID NO: 12; and see Figure 8). In contrast, the
Brassica oleracea var. botrytis CAL cDNA contains a stop
codon, predicting that the BobCAL protein will be
truncated after amino acid 150 (SEQ ID NO: 14 and see
Figure 8). The correlation of full-length Arabidopsis
and Brassica oleracea CAL gene products with a flowering
phenotype indicates that transformation of non-flowering
garden varieties of cauliflower such as Brassica oleracea
var. botrytis with a full-length CAL cDNA can induce
flowering in the transgenic cauliflower plant.



As used herein, the term "CAL gene product" 
means a full-length CAL gene product that does not 
terminate substantially before amino acid 255 and that, 
when ectopically expressed in shoot meristem, converts
shoot meristem to floral meristem. A nucleic acid
molecule encoding a CAULIFLOWER gene product can be, for
example, a nucleic acid molecule encoding Arabidopsis CAL
shown in Figure 5 (SEQ ID NO: 9) or a nucleic acid
molecule encoding Brassica oleracea CAL shown in Figure 6
(SEQ ID NO: 11). In comparison, a nucleic acid molecule
encoding a truncated CAL gene product that terminates
substantially before amino acid 255, such as the encoded
truncated BobCAL gene product (SEQ ID NO: 13), is not a
nucleic acid molecule encoding a CAL gene product as 
defined herein. Furthermore, ectopic expression of 
BobCAL in an angiosperm does not result in conversion of 
shoot meristem to floral meristem. 

As used herein, the term "AP1 gene product"
means a full-length AP1 gene product that does not
terminate substantially before amino acid 256. A nucleic
acid molecule encoding an AP1 gene product can be, for
example, a nucleic acid molecule encoding Arabidopsis AP1
shown in Figure 1 (SEQ ID NO: 1), Brassica oleracea AP1
shown in Figure 2 (SEQ ID NO: 3), Brassica oleracea var.
botrytis AP1 shown in Figure 3 (SEQ ID NO: 5) or Zea mays
AP1 shown in Figure 4 (SEQ ID NO: 7).

The invention provides a CAL polypeptide having 
at least about 70 percent amino acid identity with amino
acids 1 to 160 of SEQ ID NO: 10 or SEQ ID NO: 12. For
example, the Arabidopsis thaliana CAL polypeptide, having
the amino acid sequence shown as amino acids 1 to 255 in
Figure 5 (SEQ ID NO: 10), and the Brassica oleracea CAL
polypeptide, having the amino acid sequence shown as
amino acids 1 to 255 in Figure 6 (SEQ ID NO: 12) are 
provided by the invention. 
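
As a rough illustration of the 70 percent identity criterion, percent amino acid identity over a defined window can be computed from two sequences that have already been aligned. The following Python sketch is illustrative only: the function name percent_identity, the "-" gap convention and the toy sequences are assumptions made for this example and are not taken from SEQ ID NO: 10 or SEQ ID NO: 12. Gapped columns are simply skipped, which is one of several possible conventions.

```python
from typing import Optional


def percent_identity(a: str, b: str, start: int = 1, end: Optional[int] = None) -> float:
    """Percent identical residues between two PRE-ALIGNED sequences.

    Positions are 1-based and inclusive, mirroring phrases such as
    "amino acids 1 to 160". Columns containing a gap ("-") are skipped.
    """
    if len(a) != len(b):
        raise ValueError("sequences must be pre-aligned to equal length")
    stop = len(a) if end is None else end
    window = zip(a[start - 1:stop].upper(), b[start - 1:stop].upper())
    pairs = [(x, y) for x, y in window if x != "-" and y != "-"]
    if not pairs:
        raise ValueError("no ungapped positions in the requested window")
    matches = sum(1 for x, y in pairs if x == y)
    return 100.0 * matches / len(pairs)


# Hypothetical ten-residue strings, used only to exercise the function.
toy_a = "MGRGKVQLKR"
toy_b = "MGRGRVELKR"
print(percent_identity(toy_a, toy_b))        # whole alignment
print(percent_identity(toy_a, toy_b, 1, 4))  # first four positions only
```

In practice the alignment itself would be produced by a dedicated alignment program before a calculation of this kind is applied.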

The invention also provides the truncated 
Brassica oleracea var. botrytis CAL polypeptide having
the amino acid sequence shown as amino acids 1 to 150 in
Figure 7 (SEQ ID NO: 14). The BobCAL polypeptide can be
useful as an immunogen to produce an antibody that
specifically binds the truncated BobCAL polypeptide, but
does not bind a full-length CAL gene product. Such an
antibody can be useful to distinguish between a
full-length CAL and a truncated CAL.

The invention also provides a Zea mays
AP1 polypeptide. As used herein, the term "polypeptide"
is used in its broadest sense to include proteins,
polypeptides and peptides, which are related in that each
consists of a sequence of amino acids joined by peptide
bonds. For convenience, the terms "polypeptide,"
"protein" and "gene product" are used interchangeably.
While no specific attempt is made to distinguish the size
limitations of a protein and a peptide, one skilled in
the art would understand that proteins generally consist
of at least about 50 to 100 amino acids and that peptides
generally consist of at least two amino acids up to a few
dozen amino acids. The term polypeptide is used
generally herein to include any such amino acid sequence.

The term polypeptide also includes an active 
fragment of a floral meristem identity gene product. As 
used herein, the term "active fragment" means a
polypeptide portion of a floral meristem identity gene
product that can convert shoot meristem to floral
meristem or can provide early flowering. For example, an
active fragment of a CAL polypeptide can consist of an
amino acid sequence derived from a CAL protein as shown
in Figure 5 or 6 (SEQ ID NOS: 10 and 12) and that has an
activity of a CAL. An active fragment can be, for
example, an amino terminal or carboxyl terminal truncated
form of Arabidopsis thaliana CAL or Brassica oleracea CAL
(SEQ ID NOS: 10 or 12, respectively). Such an active
fragment can be produced using well known recombinant DNA
methods (Sambrook et al., supra, 1989). The product of
the BobCAL gene, which is truncated at amino acid 150,
lacks activity in converting shoot meristem to floral
meristem and, therefore, is an example of a polypeptide 
portion of a CAL floral meristem identity gene product 
that is not an "active fragment." 

An active fragment of a floral meristem 
identity gene product can convert shoot meristem to
floral meristem and is readily identified using the
methods described in Example II, below. Briefly,
Arabidopsis can be transformed with a nucleic acid
molecule encoding a portion of a floral meristem identity
gene product, in order to determine whether the fragment
can convert shoot meristem to floral meristem or promote 
early flowering and, therefore, has an activity of a 
floral meristem identity gene product. 
The invention further provides an antibody that
specifically binds a CAL polypeptide, an antibody that
specifically binds the truncated Brassica oleracea var.
botrytis CAL polypeptide, and an antibody that
specifically binds the Zea mays AP1 polypeptide. As used
herein, the term "antibody" is used in its broadest sense
to include polyclonal and monoclonal antibodies, as well
as polypeptide fragments of antibodies that retain a
specific binding activity for CAL protein of at least
about 1 x 10^5 M^-1. One skilled in the art would know that
anti-CAL antibody fragments such as Fab, F(ab')2 and Fv
fragments can retain specific binding activity for CAL
and, thus, are included within the definition of an
antibody. In addition, the term "antibody" as used
herein includes naturally occurring antibodies as well as
non-naturally occurring antibodies and fragments that
have binding activity such as chimeric antibodies or
humanized antibodies. Such non-naturally occurring
antibodies can be constructed using solid phase peptide
synthesis, produced recombinantly or obtained, for
example, by screening combinatorial libraries consisting
of variable heavy chains and variable light chains as
described by Huse et al., Science 246:1275-1281 (1989),
which is incorporated herein by reference. 

An antibody "specific for" a polypeptide, or
that "specifically binds" a polypeptide, binds with
substantially higher affinity to that polypeptide than to
an unrelated polypeptide. An antibody specific for a
polypeptide also can have specificity for a related
polypeptide. For example, an antibody specific for
Arabidopsis CAL also can have specificity for Brassica
oleracea CAL. 



An anti-CAL antibody, for example, can be
prepared using a CAL fusion protein or a synthetic
peptide encoding a portion of Arabidopsis CAL or of
Brassica oleracea CAL as an immunogen. One skilled in
the art would know that purified CAL protein, which can
be prepared from natural sources or produced
recombinantly, or fragments of CAL, including a peptide
portion of CAL such as a synthetic peptide, can be used
as an immunogen. Non-immunogenic fragments or synthetic
peptides of CAL can be made immunogenic by coupling the
hapten to a carrier molecule such as bovine serum albumin
(BSA) or keyhole limpet hemocyanin (KLH). In addition,
various other carrier molecules and methods for coupling
a hapten to a carrier molecule are well known in the art
and described, for example, by Harlow and Lane,
Antibodies: A Laboratory Manual (Cold Spring Harbor
Laboratory Press, 1988), which is incorporated herein by
reference. An antibody that specifically binds the
truncated BobCAL polypeptide or an antibody that
specifically binds the Zea mays AP1 polypeptide similarly
can be produced using such methods. An antibody that
specifically binds the truncated Brassica oleracea var.
botrytis CAL polypeptide can be particularly useful to
distinguish between full-length CAL polypeptide and 
truncated CAL polypeptide. 
The invention provides a method of identifying
a Brassica having a modified CAL allele by detecting a
polymorphism associated with a CAL locus, where the CAL
locus comprises a modified CAL allele that does not
encode an active CAL gene product. Such a method is
useful for the genetic improvement of Brassica plants, a
genus of great economic value.

Brassica plants are a highly diverse group of
crop plants useful as vegetables and as sources of
condiment mustard, edible and industrial oil, animal
fodder and green manure. Brassica crops encompass a
variety of well known vegetables including cabbage,
cauliflower, broccoli, collard, kale, mustard greens,
Chinese cabbage and turnip, which can be interbred for
crop improvement (see, for example, King, Euphytica
50:97-112 (1990) and Crisp and Tapsell, Genetic
improvement of vegetable crops pp. 157-178 (1993), each
of which is herein incorporated by reference).

Breeding of Brassica crops is useful, for
example, for improving the quality and early development
of vegetables. In addition, such breeding can be useful
to increase disease resistance, such as resistance of a
Brassica to clubroot disease or mildew; viral resistance,
such as resistance to turnip mosaic virus and cauliflower
mosaic virus; or pest resistance (King, supra, 1990).

The use of polymorphic molecular markers in the
breeding of Brassicae is well recognized in the art
(Crisp and Tapsell, supra, 1993). Identification of a
polymorphic molecular marker that is associated with a
desirable trait can vastly reduce the time required
to breed the desirable trait into a new Brassica species
or variant. In particular, since many rounds of
backcrossing are required to breed a new trait into a
different genetic background, early detection of a
desirable trait by molecular methods can be performed
prior to the time a plant is fully mature, thus
accelerating the rate of crop breeding (see, for example,
Figidore et al., Euphytica 69:33-44 (1993), which is
herein incorporated by reference).

A polymorphism associated with a CAL locus
comprising a modified CAL allele that does not encode an
active CAL gene product is disclosed herein. Figure 6
shows the nucleotide (SEQ ID NO: 11) and amino acid (SEQ
ID NO: 12) sequence of Brassica oleracea CAL (BoCAL), and
Figure 7 shows the nucleotide (SEQ ID NO: 13) and amino
acid (SEQ ID NO: 14) sequence of Brassica oleracea var.
botrytis CAL (BobCAL). At amino acid 150, which is
glutamic acid (Glu) in BoCAL, a stop codon is present in
BobCAL. This polymorphism results in a truncated BobCAL
gene product that is not active as a floral meristem
identity gene product. The BoCAL nucleic acid sequence
(ACGAGT) can be readily distinguished from the BobCAL
nucleic acid sequence (ACTAGT) using well known molecular
methods. For example, the polymorphic ACTAGT BobCAL
sequence is recognized by the SpeI restriction
endonuclease, whereas the ACGAGT BoCAL sequence is not
recognized by SpeI. Thus, a restriction fragment length
polymorphism (RFLP) in BobCAL provides a simple means for
therefore, can serve as a marker to predict the
inheritance of the "cauliflower" phenotype. 
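
The SpeI polymorphism described above can be sketched computationally. The following Python example is a minimal illustration, assuming a linear DNA fragment and complete digestion; the function names and the short flanking sequences are hypothetical and are not taken from SEQ ID NO: 11 or SEQ ID NO: 13. Only the six-base ACGAGT and ACTAGT motifs come from the text. SpeI recognizes ACTAGT and cuts after the first A.

```python
SPEI_SITE = "ACTAGT"  # SpeI recognition sequence; cleavage occurs after the first A


def spei_fragments(seq: str) -> list:
    """Fragment lengths after complete SpeI digestion of a linear sequence."""
    seq = seq.upper()
    cuts = []
    pos = seq.find(SPEI_SITE)
    while pos != -1:
        cuts.append(pos + 1)  # cut between A and CTAGT
        pos = seq.find(SPEI_SITE, pos + 1)
    bounds = [0] + cuts + [len(seq)]
    return [right - left for left, right in zip(bounds, bounds[1:])]


def call_allele(seq: str) -> str:
    """Classify a fragment by the presence or absence of the SpeI site."""
    if SPEI_SITE in seq.upper():
        return "BobCAL-like (SpeI site present)"
    return "BoCAL-like (SpeI site absent)"


# Hypothetical flanks around the two motifs named in the text.
bo_toy = "GGGTTT" + "ACGAGT" + "CCCAAA"
bob_toy = "GGGTTT" + "ACTAGT" + "CCCAAA"

for name, seq in (("BoCAL toy", bo_toy), ("BobCAL toy", bob_toy)):
    print(name, call_allele(seq), spei_fragments(seq))
```

A single uncut fragment corresponds to the BoCAL pattern, whereas two fragments correspond to the BobCAL pattern; on a real blot the fragment sizes would depend on the neighboring SpeI sites in genomic DNA.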

A modified CAL allele encoding a truncated CAL 
gene product also can serve as a marker to predict the
"cauliflower" phenotype in other cauliflower variants.
For example, nine romanesco variants of Brassica oleracea
var. botrytis, which each have the "cauliflower"
phenotype, were examined for the presence of a stop codon
at position 151 of the CAL coding sequence. All nine of
the romanesco variants contained the SpeI site that
indicates a stop codon and, thus, a truncated CAL gene
product. In contrast, Brassica oleracea variants that
lack the "cauliflower" phenotype (broccoli and brussels
sprouts) were examined for the SpeI site. In every case,
the broccoli and brussels sprouts variants had a
full-length CAL coding sequence, as indicated by the
absence of the distinguishing SpeI site. Thus, a
truncated CAL gene product can be involved in the
"cauliflower" phenotype in numerous different Brassica
variants.



As used herein, the term "modified CAL allele" 
means a CAL allele that does not encode a CAL gene 
product active in converting shoot meristem to floral 
meristem. A modified CAL allele can have a modification
within a gene regulatory element such that a CAL gene
product is not produced. In addition, a modified CAL
allele can have a modification such as a mutation,
deletion or insertion in a CAL coding sequence which
results in an inactive CAL gene product. For example, an
inactive CAL gene product can result from a mutation
creating a stop codon, such that a truncated, inactive
CAL gene product lacking the ability to convert shoot
meristem to floral meristem is produced.
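
The effect of a mutation that creates a stop codon can be illustrated with a short translation sketch. The Python example below is illustrative only. It assumes the standard genetic code, in which GAG encodes glutamic acid and TAG is a stop codon, and it assumes that the ACGAGT and ACTAGT motifs are read in the frame AC|GAG|T and AC|TAG|T, as implied by the Glu-to-stop change described for BobCAL. The twelve-base toy coding sequences are hypothetical and are not the CAL sequences.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(cds: str) -> str:
    """Translate a coding sequence, stopping at the first stop codon."""
    cds = cds.upper()
    protein = []
    for i in range(0, len(cds) - 2, 3):
        aa = CODON_TABLE[cds[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


# Hypothetical toy sequences that embed the two motifs in frame.
TOY_FULL = "ATGAACGAGTCA"  # ATG AAC GAG TCA, contains ACGAGT
TOY_STOP = "ATGAACTAGTCA"  # ATG AAC TAG TCA, contains ACTAGT

print(translate(TOY_FULL))  # full-length toy product
print(translate(TOY_STOP))  # truncated at the new stop codon
```

The single G-to-T change converts the Glu codon to a stop codon, so every residue downstream of that position is absent from the translated product.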

As used herein, the term "associated" means 
closely linked and describes the tendency of two genetic 
loci to be inherited together as a result of their 
proximity. If two genetic loci are associated and are
polymorphic, one locus can serve as a marker for the
inheritance of the second locus. Thus, a polymorphism
associated with a CAL locus comprising a modified CAL
allele can serve as a marker for inheritance of the
modified CAL allele. An associated polymorphism can be
located in proximity to a CAL gene or can be located
within a CAL gene.

A polymorphism in a nucleic acid sequence can 
be detected by a variety of methods. For example, if the 
polymorphism occurs in a particular restriction
endonuclease site, the polymorphism can be detected by a
difference in restriction fragment length observed
following restriction with the particular restriction
endonuclease and hybridization with a nucleotide sequence
that is complementary to a nucleic acid sequence
including a polymorphism.

The use of restriction fragment length
polymorphism as an aid to breeding Brassicae is well
known in the art (see, for example, Slocum et al., Theor.
Appl. Genet. 80:57-64 (1990); Kennard et al., Theor.
Appl. Genet. 87:721-732 (1994); and Figidore et al.,
supra, 1993, each of which is herein incorporated by
reference). A restriction endonuclease such as SpeI,
which is useful for identifying the presence of a BobCAL
allele in an angiosperm, is readily available and can be
purchased from a commercial source. Furthermore, a
nucleotide sequence that is complementary to a nucleic
acid sequence having a polymorphism associated with a CAL
locus comprising a modified CAL allele can be derived,
for example, from the nucleic acid molecule encoding
Brassica oleracea var. botrytis CAL shown in Figure 7
(SEQ ID NO: 13) or from the nucleic acid molecule
encoding Brassica oleracea CAL shown in Figure 6 (SEQ ID
NO: 11).

In some cases, a polymorphism is not
distinguishable by an RFLP, but nevertheless can be used
to identify a Brassica having a modified CAL allele. For
example, the polymerase chain reaction (PCR) can be used
to detect a polymorphism associated with a CAL locus
comprising a modified CAL allele. Specifically, a
polymorphic region of a modified allele can be
selectively amplified by using a primer that matches the
nucleotide sequence of one allele of a polymorphic locus,
but does not match the sequence of the second allele
(Sobral and Honeycutt, The Polymerase Chain Reaction, pp.
304-319 (1994), which is herein incorporated by
reference). Other well-known approaches for analyzing a
polymorphism using PCR include discriminant hybridization
of PCR-amplified DNA to allele-specific oligonucleotides
and denaturing gradient gel electrophoresis (see Innis et
al., supra, 1990).
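
The allele-specific primer approach can be sketched as follows. This Python example is a deliberately crude model, assuming that a primer supports amplification only when it matches the template exactly, including its 3' terminal base; real specificity depends on primer length, annealing temperature and reaction conditions. The function names, the eight-base primer length and the flanking sequences are hypothetical; only the ACGAGT and ACTAGT motifs come from the text.

```python
def single_difference(a: str, b: str) -> int:
    """Return the 0-based index of the one position at which two alleles differ."""
    if len(a) != len(b):
        raise ValueError("alleles must have equal length")
    diffs = [i for i, (x, y) in enumerate(zip(a, b)) if x != y]
    if len(diffs) != 1:
        raise ValueError("expected exactly one differing position")
    return diffs[0]


def allele_specific_primer(allele: str, snp_index: int, length: int = 8) -> str:
    """Forward primer whose 3' terminal base sits on the polymorphic position."""
    start = snp_index - length + 1
    if start < 0:
        raise ValueError("not enough upstream sequence for this primer length")
    return allele[start:snp_index + 1]


def primes(primer: str, template: str) -> bool:
    """Crude model: amplification requires an exact match on the template."""
    return primer in template


# Hypothetical flanks around the motifs named in the text.
BO_TOY = "GGGTTTCCAG" + "ACGAGT" + "CCCAAA"
BOB_TOY = "GGGTTTCCAG" + "ACTAGT" + "CCCAAA"

snp = single_difference(BO_TOY, BOB_TOY)
bob_primer = allele_specific_primer(BOB_TOY, snp)
print(bob_primer, primes(bob_primer, BOB_TOY), primes(bob_primer, BO_TOY))
```

Placing the polymorphic base at the 3' end of the primer is a common design choice because a terminal mismatch interferes most strongly with extension.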



The invention further provides a nucleic acid 
molecule encoding a chimeric protein, comprising a 
nucleic acid molecule encoding a floral meristem identity
gene product such as AP1, LFY or CAL operably linked to a
nucleic acid molecule encoding a ligand binding domain.
Expression of a chimeric protein of the invention in an
angiosperm is particularly useful because the ligand
binding domain confers regulatable activity on a gene
product such as a floral meristem identity gene product
to which it is fused. Specifically, the floral meristem
identity gene product component of the chimeric protein
is inactive in the absence of the particular ligand,
whereas, in the presence of ligand, the ligand binds the
ligand binding domain, resulting in floral meristem 
identity gene product activity. 



A nucleic acid molecule encoding a chimeric 
protein of the invention contains a nucleic acid molecule
encoding a floral meristem identity gene product, such as
a nucleic acid molecule encoding the amino acid sequence
shown in Figure 1 (SEQ ID NO: 2), in Figure 5 (SEQ ID NO:
10), or in Figure 9 (SEQ ID NO: 10), any of which is
operably linked to a nucleic acid molecule encoding a
ligand binding domain. The expression of such a nucleic
acid molecule results in the production of a chimeric
protein comprising a floral meristem identity gene
product fused to a ligand binding domain. Thus, the
invention also provides a chimeric protein comprising a
floral meristem identity gene product fused to a ligand
binding domain. 

A ligand binding domain useful in a chimeric 
protein of the invention can be a steroid binding domain 
such as the ligand binding domain of a glucocorticoid
receptor, estrogen receptor, progesterone receptor,
androgen receptor, thyroid receptor, vitamin D receptor
or retinoic acid receptor. A particularly useful ligand
binding domain is a glucocorticoid receptor ligand
binding domain, encompassed, for example, within amino
acids 512 to 795 of the rat glucocorticoid receptor as
shown in Figure 16 (SEQ ID NO: 24; Miesfeld et al., Cell
46:389-399 (1986), which is incorporated herein by
reference).



A ligand binding domain, such as the rat
glucocorticoid receptor ligand binding domain, confers
glucocorticoid-dependent activity on a chimeric protein
that contains it. For example, the activity of
chimeric proteins consisting of adenovirus E1A, c-myc,
c-fos, the HIV-1 Rev transactivator, MyoD or maize
regulatory factor R fused to the rat glucocorticoid
receptor ligand binding domain is regulated by
glucocorticoid hormone (Eilers et al., Nature 340:66
(1989); Superti-Furga et al., Proc. Natl. Acad. Sci.
U.S.A. 88:5114 (1991); Hope et al., Proc. Natl. Acad.
Sci. U.S.A. 87:7787 (1990); Hollenberg et al., Proc.
Natl. Acad. Sci. U.S.A. 90:8028 (1993), each of which is
incorporated herein by reference).
Such a chimeric protein also can be regulated
in plants. For example, a chimeric protein containing a
heterologous protein fused to a rat glucocorticoid
receptor ligand binding domain (amino acids 512 to 795)
was expressed under the control of the constitutive
cauliflower mosaic virus 35S promoter in Arabidopsis.
The activity of the chimeric protein was inducible; the
chimeric protein was inactive in the absence of ligand,
and became active upon treatment of transformed plants
with a synthetic glucocorticoid, dexamethasone (Lloyd et
al., Science 266:436-439 (1994), which is incorporated
herein by reference). As disclosed herein, a ligand
binding domain fused to a floral meristem identity gene
product can confer ligand inducibility on the activity of
a fused floral meristem identity gene product in plants
such that, upon exposure to a particular ligand, the 
floral meristem identity gene product is active. 

Methods for constructing a nucleic acid 
molecule encoding a chimeric protein are routine and well
known in the art (Sambrook et al., supra, 1989). For
example, the skilled artisan would recognize that a stop
codon in the 5' nucleic acid molecule must be removed and
that the two nucleic acid molecules must be linked such
that the reading frame of the 3' nucleic acid molecule is
preserved. Methods of transforming plants with nucleic
acid molecules also are well known in the art (see, for
example, Moloney et al., U.S. patent number 5,463,174,
and Barry et al., U.S. patent number 5,463,175, each of
which is incorporated herein by reference).
As used herein, the term "operably linked,"
when used in reference to two nucleic acid molecules
comprising a nucleic acid molecule encoding a chimeric
protein, means that the two nucleic acid molecules are
linked in frame such that a full-length chimeric protein
can be expressed. In particular, the 5' nucleic acid
molecule, which encodes the amino-terminal portion of the
chimeric protein, must be linked to the 3' nucleic acid
molecule, which encodes the carboxyl-terminal portion of
the chimeric protein, such that the carboxyl-terminal
portion of the chimeric protein is produced in the
correct reading frame.
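
The in-frame linkage requirement can be illustrated with a minimal sketch. The Python example below assumes that each coding sequence is supplied as a complete open reading frame whose length is a multiple of three, and that the two are joined directly without a linker; the function name fuse_in_frame and the toy sequences are hypothetical and do not represent AP1, CAL, LFY or the glucocorticoid receptor. The sketch removes the terminal stop codon of the 5' molecule and checks that no in-frame stop codon remains before the junction.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def fuse_in_frame(five_prime_cds: str, three_prime_cds: str) -> str:
    """Join two coding sequences so that the 3' sequence stays in frame."""
    five = five_prime_cds.upper()
    three = three_prime_cds.upper()
    if len(five) % 3 != 0 or len(three) % 3 != 0:
        raise ValueError("each coding sequence must be a whole number of codons")
    # Remove the stop codon that would otherwise terminate translation
    # before the 3' coding sequence is reached.
    if five[-3:] in STOP_CODONS:
        five = five[:-3]
    # Any remaining in-frame stop in the 5' part would truncate the fusion.
    for i in range(0, len(five), 3):
        if five[i:i + 3] in STOP_CODONS:
            raise ValueError("internal in-frame stop codon at position %d" % i)
    return five + three


# Hypothetical toy open reading frames.
toy_five = "ATGGCTTAA"   # ATG GCT TAA
toy_three = "GGTTCTTGA"  # GGT TCT TGA
print(fuse_in_frame(toy_five, toy_three))
```

Because both inputs contain whole codons and only a whole codon is removed, the junction falls on a codon boundary and the carboxyl-terminal portion is read in its original frame.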

The invention further provides a transgenic 
angiosperm containing a nucleic acid molecule encoding a
chimeric protein, comprising a nucleic acid molecule
encoding a floral meristem identity gene product such as
AP1, CAL or LFY linked to a nucleic acid molecule
encoding a ligand binding domain. Such a transgenic
angiosperm is particularly useful because the angiosperm
can be induced to flower by contacting the angiosperm
with a ligand that binds the ligand binding domain.
Thus, the invention provides a method of promoting early
flowering or of converting shoot meristem to floral
meristem in a transgenic angiosperm containing a nucleic
acid molecule encoding a chimeric protein of the
invention, comprising expressing the nucleic acid
molecule encoding the chimeric protein in the angiosperm,
and contacting the angiosperm with a ligand that binds
the ligand binding domain, wherein binding of the ligand
to the ligand binding domain activates the floral
invention provides methods of promoting early flowering
or of converting shoot meristem to floral meristem in a
transgenic angiosperm containing a nucleic acid molecule
encoding a chimeric protein that consists of a nucleic
acid molecule encoding AP1 or CAL or LFY linked to a
nucleic acid molecule encoding a glucocorticoid receptor 
ligand binding domain by contacting the transgenic 
angiosperm with a glucocorticoid such as dexamethasone. 

As used herein, the term "ligand" means a
naturally occurring or synthetic chemical or biological
molecule such as a simple or complex organic molecule, a
peptide, a protein or an oligonucleotide that
specifically binds a ligand binding domain. A ligand of
the invention can be used alone, in solution, or can be
used in conjunction with an acceptable carrier that can 
serve to stabilize the ligand or promote absorption of 
the ligand by an angiosperm. 

One skilled in the art can readily determine 
the optimum concentration of ligand needed to bind a
ligand binding domain and render a floral meristem
identity gene product active. Generally, a concentration
of about 1 nM to 1 µM dexamethasone is useful for
activating floral meristem identity gene product activity
in a chimeric protein comprising a floral meristem
identity gene product and a glucocorticoid receptor
ligand binding domain (Lloyd et al., supra, 1994).
A transgenic angiosperm expressing a chimeric
protein of the invention can be contacted with ligand in
a variety of manners including, for example, by spraying,
injecting or immersing the angiosperm. Further, a plant
may be contacted with a ligand by adding the ligand to
the plant's water supply or to the soil, whereby the 
ligand is absorbed into the angiosperm. 

The following examples are intended to 
illustrate but not limit the present invention.

EXAMPLE I 

Identification and characterization of the
Zea mays APETALA1 cDNA

This example describes the isolation and
characterization of the Zea mays ZAP1 gene, which is
an ortholog of the Arabidopsis floral meristem identity
gene, AP1.

A. Identification and characterization of a nucleic acid
sequence encoding ZAP1

The utility of using a cloned floral homeotic
gene from Arabidopsis to identify the putative ortholog
in maize has previously been demonstrated (Schmidt et
al., supra, 1993, which is incorporated herein by
reference). As described in Mena et al. (Plant J.
8(6):845-854 (1995)), the maize ortholog of the
Arabidopsis AP1 floral meristem identity gene was
isolated by screening a Zea mays ear cDNA library using
the Arabidopsis AP1 cDNA (SEQ ID NO: 1) as a probe. The
cDNA library was prepared from wild-type immature ears as
described by Schmidt et al., supra, 1993. The
Arabidopsis AP1 cDNA used as the probe is shown in
Figure 1 (SEQ ID NO: 1).
Low-stringency hybridizations with the AP1 probe were
conducted as described previously for the isolation of
ZAG1 using the AG cDNA as a probe (Schmidt et al., supra,
1993). Positive plaques were isolated and cDNAs were
recovered in Bluescript by in vivo excision.
Double-stranded sequencing was performed using the
Sequenase Version 2.0 kit (U.S. Biochemical, Cleveland,
Ohio) according to the manufacturer's protocol.

The cDNA sequence and deduced amino acid
sequence for ZAP1 are shown in Figure 4 (SEQ ID NOS: 7
and 8). The deduced amino acid sequence for ZAP1 shares
89% identity with Arabidopsis AP1 through the MADS domain
(amino acids 1 to 57) and 70% identity through the first
160 amino acids, which includes the K domain. The high
level of amino acid sequence identity between ZAP1 and
AP1 (SEQ ID NOS: 8 and 2), as well as the expression
pattern of ZAP1 in maize florets (see below), indicates
that ZAP1 is the maize ortholog of Arabidopsis AP1.

B. RNA expression pattern of ZAP1

Total RNA was isolated from different maize
tissues as described by Cone et al., Proc. Natl. Acad.
Sci. USA 83:9631-9635 (1986), which is herein
incorporated by reference. RNA was prepared from ears or
tassels at early developing stages (approximately 2 cm in
size), husk leaves from developing ear shoots, shoots and
roots of germinated seedlings, leaves from 2 to 3 week
old plants, and endosperm and embryos at 18 days after
pollination. Mature floral organs were dissected from
ears at the time of silk emergence or from tassels at
several days pre-emergence. To study expression patterns
in the mature female flower, carpels were isolated and
the remaining sterile organs were pooled and analyzed
together. In the same way, stamens were dissected and 
collected from male florets and the remaining organs 
(excluding the glumes) were pooled as one sample. 

RNA concentration and purity were determined by
absorbance at 260/280 nm, and equal amounts (10 µg) were
fractionated on formaldehyde-agarose gels. Gels were
stained in a solution of 0.125 µg ml^-1 acridine orange to
confirm the integrity of the RNA samples and the
uniformity of gel loading, then RNA was blotted on to
Hybond-N® membranes (Amersham International, Arlington
Heights, Illinois) according to the manufacturer's
instructions. Prehybridization and hybridization
solutions were prepared as previously described (Schmidt
et al., Science 238:960-963 (1987), which is incorporated
herein by reference). The probe for ZAP1 RNA expression
studies was a 445 bp SacI-NsiI fragment from the 3' end
of the cDNA. Southern blot analyses were conducted to
establish conditions for specific hybridization of this
probe. No cross-hybridization was detected with
hybridization at 60°C in 50% formamide and washes at 65°C
in 0.1x SSC and 0.5% SDS.
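
The spectrophotometric step used to load equal amounts of RNA can be sketched as a short calculation. The Python example below is illustrative only. It assumes the common approximation that one A260 unit corresponds to about 40 µg of single-stranded RNA per ml in a 1 cm path length, and the function names and example readings are hypothetical rather than values from this example.

```python
RNA_UG_PER_ML_PER_A260 = 40.0  # assumed conversion factor, 1 cm path length


def rna_concentration_ug_per_ul(a260: float, dilution_factor: float = 1.0) -> float:
    """Stock RNA concentration in µg/µl from an A260 reading of a diluted sample."""
    return a260 * RNA_UG_PER_ML_PER_A260 * dilution_factor / 1000.0


def purity_ratio(a260: float, a280: float) -> float:
    """A260/A280 ratio, often used as a rough indicator of RNA purity."""
    return a260 / a280


def load_volume_ul(target_ug: float, concentration_ug_per_ul: float) -> float:
    """Volume of stock needed to load a target mass of RNA in one lane."""
    return target_ug / concentration_ug_per_ul


# Hypothetical readings from a 1:100 dilution of a stock sample.
conc = rna_concentration_ug_per_ul(0.5, dilution_factor=100)
print(conc, purity_ratio(0.5, 0.25), load_volume_ul(10.0, conc))
```

With these hypothetical readings, the volume returned by load_volume_ul is the amount of stock that contains the 10 µg loaded per lane.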



The strong sequence similarity between ZAP1 and
AP1 indicated that ZAP1 was the ortholog of this
Arabidopsis floral meristem identity gene. As a first
approximation of whether the pattern of ZAP1 expression
paralleled that of AP1, a blot of total RNA from
vegetative and reproductive organs was hybridized with a
gene-specific fragment of the ZAP1 cDNA (nucleotides 370
to 820 of SEQ ID NO: 7). ZAP1 RNA was detected only in
male and female inflorescences and in the husk leaves
that surround the developing ear. No ZAP1 RNA expression
was detectable in RNA isolated from root, shoot, leaf,
endosperm, or embryo tissue. The restriction of ZAP1
expression to terminal and axillary inflorescences is
consistent with ZAP1 being the Arabidopsis AP1 ortholog.

Male and female florets were isolated from
mature inflorescences, and the reproductive organs were
separated from the remainder of the floret. RNA was
isolated from the reproductive and the sterile portions
of the florets. ZAP1 RNA expression was not detected in
maize stamens or carpels, whereas high levels of ZAP1
RNA were present in developing ear and tassel florets
from which the stamens and carpels had been removed.
Thus, the exclusion of ZAP1 expression in stamens and
carpels and its inclusion in the RNA of the
non-reproductive portions of the floret (lodicules, lemma
and palea) is similar to the pattern of expression of AP1
in flowers of Arabidopsis.






EXAMPLE II 

Conversion of shoot meristem to floral meristem in an
APETALA1 transgenic plant

This example describes methods for producing a
transgenic Arabidopsis plant, in which shoot meristem is
converted to floral meristem.

A. Ectopic expression of APETALA1 converts inflorescence
shoots into flowers

Transgenic plants that constitutively express
AP1 from the cauliflower mosaic virus 35S (CaMV35S)
promoter were produced to determine whether ectopic AP1
expression could convert shoot meristem to floral
meristem. The AP1 coding sequence was placed under
control of the cauliflower mosaic virus 35S promoter
(Odell et al., supra, 1985) as follows. BamHI linkers
were ligated to the HincII site of the full-length AP1
complementary DNA (Mandel et al., supra, (1992), which is
incorporated herein by reference) in pAM116, and the
resulting BamHI fragment was fused to the cauliflower
mosaic virus 35S promoter (Jack et al., Cell 76:703-716
(1994), which is incorporated herein by reference) in
pCGN18 to create pAM563.

Transgenic AP1 Arabidopsis plants of the
Columbia ecotype were generated by selecting
kanamycin-resistant plants after Agrobacterium-mediated
plant transformation using the in planta method (Bechtold
et al., C.R. Acad. Sci. Paris 316:1194-1199 (1993), which
is incorporated herein by reference). All analyses were
performed in subsequent generations. Approximately 120
independent transgenic lines that displayed the described
phenotypes were obtained.

Remarkably, in 35S-AP1 transgenic plants, the
normally indeterminate shoot apex prematurely
terminated as a floral meristem and formed a terminal
flower. In addition, all lateral meristems that normally
would produce inflorescence shoots also were converted
into solitary flowers. These results demonstrate that
ectopic expression of AP1 in shoot meristem is sufficient
to convert shoot meristem to floral meristem, even though
AP1 normally is not absolutely required to specify floral
meristem identity.

B. LEAFY is not required for the conversion of
inflorescence shoots to flowers in an APETALA1
transgenic plant

To determine whether the 35S-AP1 transgene
causes ectopic LFY activity, and whether ectopic LFY
activity is required for the conversion of shoot meristem
to floral meristem, the 35S-AP1 transgene was introduced
into Arabidopsis lfy mutants. The 35S-AP1 transgene was
crossed into the strong lfy-6 mutant background and the F2
progeny were analyzed.

Lfy mutant plants containing the 35S-AP1
transgene displayed the same conversion of apical and
lateral shoot meristem to floral meristem as was observed
in transgenics containing wild type LFY. However, the
resulting flowers had the typical lfy mutant phenotype,
in which floral organs developed as sepaloid and
carpelloid structures, with an absence of petals and
stamens. These results demonstrate that LFY is not
required for the conversion of shoot meristem to floral
meristem in a transgenic angiosperm that ectopically
expresses AP1.
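
As a rough guide to how often the informative class would be expected among the F2 progeny described above, the genotype classes can be enumerated under simple Mendelian assumptions. The sketch below assumes, and the text does not state, that the F1 parent is heterozygous for lfy-6, that lfy-6 behaves as a recessive allele, and that the 35S-AP1 transgene is a single insertion unlinked to LFY and hemizygous in the F1. The function name is hypothetical.

```python
from fractions import Fraction
from itertools import product


def f2_class_fractions():
    """Enumerate F2 classes from selfing an F1 that is LFY/lfy and T/- (hemizygous transgene).

    Assumptions (illustrative, not from the text): lfy is recessive, and the
    transgene is a single locus that assorts independently of LFY.
    Returns a dict keyed by (is_lfy_homozygote, carries_transgene).
    """
    gametes = list(product(["LFY", "lfy"], ["T", "-"]))  # four equally likely gametes
    counts = {}
    for (lfy_a, tg_a), (lfy_b, tg_b) in product(gametes, repeat=2):
        is_lfy_homozygote = lfy_a == "lfy" and lfy_b == "lfy"
        carries_transgene = "T" in (tg_a, tg_b)
        key = (is_lfy_homozygote, carries_transgene)
        counts[key] = counts.get(key, 0) + 1
    total = len(gametes) ** 2
    return {key: Fraction(n, total) for key, n in counts.items()}


if __name__ == "__main__":
    for (mutant, transgenic), fraction in sorted(f2_class_fractions().items()):
        print(f"lfy/lfy={mutant!s:5} transgene={transgenic!s:5} expected fraction={fraction}")
```

Under these assumptions, 3 of every 16 F2 plants would be homozygous for lfy-6 and carry the transgene. Linkage to LFY or multiple insertions would shift these fractions.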

C. APETALA1 is not sufficient to specify organ fate

As well as being involved in the early step of
specifying floral meristem identity, AP1 also is involved
in specifying sepal and petal identity at a later stage
in flower development. Although AP1 RNA is initially
expressed throughout the young flower primordium, it is
later excluded from stamen and carpel primordia (Mandel
et al., Nature 360:273-277 (1992)). Since the
cauliflower mosaic virus 35S promoter is active in all
floral organs, 35S-AP1 transgenic plants are likely to
ectopically express AP1 in stamens and carpels. However,
35S-AP1 transgenic plants had normal stamens and carpels,
indicating that AP1 is not sufficient to specify sepal
and petal organ fate.

D. Ectopic expression of APETALA1 causes early flowering

In addition to its ability to alter
inflorescence meristem identity, ectopic expression of
AP1 also influences the vegetative phase of plant growth.
Wild-type plants have a vegetative phase during which a
basal rosette of leaves is produced, followed by the
transition to reproductive growth. The transition from
vegetative to reproductive growth was measured both in
terms of the number of days post-germination until the
first visible flowers were observed, and by counting the
number of leaves. Under continuous light, wild-type and
35S-AP1 transgenic plants flowered after producing 9.88 ±
1.45 and 4.16 ± 0.97 leaves, respectively. Under short-day
growth conditions (8 hours light, 16 hours dark, 24°C),
wild-type and 35S-AP1 transgenic plants flowered after
producing 52.42 ± 3.47 and 7.4 ± 1.18 leaves, respectively.
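
The leaf counts reported above can be compared directly as ratios. The following sketch uses only the mean values and deviations quoted in the preceding paragraph. The class and function names are hypothetical, and the ratios are a simple descriptive summary, not a statistical test.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LeafCount:
    mean: float  # mean number of leaves produced before flowering
    sd: float    # reported deviation (the text gives it as "±")


# Values copied from the paragraph above.
LEAVES = {
    ("continuous light", "wild type"): LeafCount(9.88, 1.45),
    ("continuous light", "35S-AP1"): LeafCount(4.16, 0.97),
    ("short day", "wild type"): LeafCount(52.42, 3.47),
    ("short day", "35S-AP1"): LeafCount(7.4, 1.18),
}


def fold_reduction(condition: str) -> float:
    """Wild-type mean leaf number divided by the 35S-AP1 mean for one light regime."""
    wild_type = LEAVES[(condition, "wild type")]
    transgenic = LEAVES[(condition, "35S-AP1")]
    return wild_type.mean / transgenic.mean


def short_day_increase(genotype: str) -> float:
    """Short-day mean leaf number divided by the continuous-light mean for one genotype."""
    short_day = LEAVES[("short day", genotype)]
    continuous = LEAVES[("continuous light", genotype)]
    return short_day.mean / continuous.mean


if __name__ == "__main__":
    for condition in ("continuous light", "short day"):
        print(f"{condition}: wild type / 35S-AP1 = {fold_reduction(condition):.2f}")
    for genotype in ("wild type", "35S-AP1"):
        print(f"{genotype}: short day / continuous light = {short_day_increase(genotype):.2f}")
```

On these figures the transgenic plants produce roughly 2.4-fold fewer leaves than wild type under continuous light and roughly 7-fold fewer under short days. Short days increase leaf number about 5.3-fold in wild type but less than 2-fold in the transgenic plants.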

In summary, under continuous light growth
conditions, flowers appear on wild-type Arabidopsis
plants after approximately 18 days, whereas the 35S-AP1
transgenic plants flowered after an average of only 10
days. Furthermore, under short-day growth conditions,
flowering is delayed in wild-type plants until
approximately 10 weeks after germination, whereas 35S-AP1
transgenic plants flowered in less than 3 weeks.
Thus, ectopic AP1 activity significantly reduced the time
to flowering and reduced the delay of flowering caused by
short day growth conditions.

EXAMPLE III

Isolation and characterization of the Arabidopsis and 
Brassica oleracea CAULIFLOWER genes 

This example describes methods for isolating
and characterizing the Arabidopsis and Brassica oleracea
CAL genes.

A. Isolation of the Arabidopsis and Brassica oleracea
CAULIFLOWER genes

Genetic evidence that CAL and AP1 proteins may
be functionally related indicated that the genes encoding
these proteins may share similar DNA sequences. In
addition, DNA blot hybridization revealed that the
Arabidopsis genome contains a gene that is closely
related to AP1. The CAL gene, which is closely related to
AP1, was isolated and identified as a member of the
family of Arabidopsis MADS domain genes known as the
AGAMOUS-like (AGL) genes.

Hybridization with an AP1 probe was used to
isolate a 4.8-kb EcoRI genomic fragment of CAL. The
corresponding CAL complementary DNA (pBS85) was cloned by
reverse transcription-polymerase chain reaction (RT-PCR)
with the oligonucleotides AGL10-1
(5'-GATCGTCGTTATCTCTCTTG-3'; SEQ ID NO: 25) and AGL10-12
(5'-GTAGTCTATTCAAGCGGCG-3'; SEQ ID NO: 26).
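
The two RT-PCR primer sequences given above can be sanity-checked with a few lines of Python. This sketch reports length, GC content, the reverse complement, and a rough melting temperature from the Wallace rule of thumb (2°C per A or T plus 4°C per G or C). The Wallace rule is an assumption introduced here for illustration and is not a condition stated in this example; the function names are hypothetical.

```python
# Primer sequences as printed in the text (5' to 3').
PRIMERS = {
    "AGL10-1": "GATCGTCGTTATCTCTCTTG",
    "AGL10-12": "GTAGTCTATTCAAGCGGCG",
}

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence containing only A, C, G, T."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def gc_fraction(seq: str) -> float:
    """Fraction of bases that are G or C."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


def wallace_tm(seq: str) -> int:
    """Rule-of-thumb melting temperature in degrees C for short oligonucleotides."""
    seq = seq.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc


if __name__ == "__main__":
    for name, seq in PRIMERS.items():
        print(
            f"{name}: {len(seq)} nt, GC {gc_fraction(seq):.0%}, "
            f"Tm ~{wallace_tm(seq)} C, revcomp {reverse_complement(seq)}"
        )
```

The reverse complement is useful for locating the binding site of the downstream primer on the coding strand of a cDNA sequence.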

The Arabidopsis CAL cDNA encodes a putative 255
amino acid protein (Figure 5; SEQ ID NO: 10) having a
calculated molecular weight of 30.1 kD and an isoelectric
point of 8.78. The deduced amino acid sequence for CAL
contains a MADS domain, which generally is present in a
class of transcription factors. The MADS domains of CAL
and AP1 were markedly similar, differing in only 5 of 56
amino acid residues, 4 of which represent conservative
replacements. Overall, the putative CAL protein is 76%
identical to AP1; with allowance for conservative amino
acid substitutions, the two proteins are 88% similar.
These results indicate that CAL and AP1 may recognize
similar target sequences and regulate many of the same
genes involved in floral meristem identity.
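
The identity figures quoted above follow from counting matching positions in an alignment. The sketch below shows one common way to compute percent identity over pre-aligned sequences and applies the same arithmetic to the MADS domain counts given in the text (5 differences in 56 residues, 4 of them conservative). The function name and the short demonstration peptides are hypothetical, and the choice to ignore gap columns is an assumption, since the text does not say how the percentages were derived.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over two pre-aligned sequences of equal length.

    Columns in which either sequence has a gap ('-') are skipped; other
    conventions for the denominator exist and give slightly different values.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" or b == "-":
            continue
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable columns")
    return 100.0 * matches / compared


# MADS domain counts taken from the text.
MADS_LENGTH = 56
MADS_DIFFERENCES = 5
MADS_CONSERVATIVE = 4

mads_identity = 100.0 * (MADS_LENGTH - MADS_DIFFERENCES) / MADS_LENGTH
mads_similarity = 100.0 * (MADS_LENGTH - MADS_DIFFERENCES + MADS_CONSERVATIVE) / MADS_LENGTH

if __name__ == "__main__":
    # Hypothetical eight-residue peptides, for demonstration only.
    print(percent_identity("MGRGKVEL", "MGRGRVQL"))
    print(f"MADS identity {mads_identity:.1f}%, similarity {mads_similarity:.1f}%")
```

By this arithmetic the MADS domains of CAL and AP1 are about 91% identical and about 98% similar, noticeably higher than the whole-protein values of 76% and 88%.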

CAL was mapped to the approximate location of
the loci identified by classical genetic means for the
cauliflower phenotype (Bowman et al., Development 119:721
(1993), which is herein incorporated by reference).
Restriction fragment length polymorphism (RFLP) mapping
filters were scored and the results analyzed with the
Macintosh version of the Mapmaker program as described by
Reiter et al. (Proc. Natl. Acad. Sci. USA 89:1477
(1992), which is herein incorporated by reference). The
results localized CAL to the upper arm of chromosome 1,
near marker λ235.

A genomic fragment spanning the CAL gene was
used to transform cal-1 ap1-1 plants. A 5850-bp BamHI
fragment containing the entire coding region of the
Arabidopsis CAL gene as well as 1860 bp upstream of the
putative translational start site was inserted into the
pBIN19 plant transformation vector (Clontech, Palo Alto,
California) and used for transformation of root tissue
from cal-1 ap1-1 plants as described by Valvekens et al.
(Proc. Natl. Acad. Sci. USA 85:5536 (1988), which is
incorporated herein by reference). Seeds were harvested
from primary transformants, and all phenotypic analyses
were performed in subsequent generations. Four
independent lines transformed with CAL showed a
complementation of the cauliflower (cal) phenotype and
displayed a range of phenotypes similar to those
exhibited by ap1 mutants. These results demonstrated
that CAL functions to convert shoot meristem to floral
meristem.



In order to identify regions of functional
importance in the CAL protein, cal mutants were generated
and analyzed. The cal alleles were isolated by
mutagenizing seeds homozygous for the ap1-1 allele in Ler
with 0.1% or 0.05% ethylmethane sulfonate (EMS) for 16
hours. Putative new cal alleles were crossed to cal-1
ap1-1 chlorina plants to verify allelism. Two sets of
oligonucleotides were used to amplify and clone new
alleles: oligos AGL10-1 (SEQ ID NO: 25) and AGL10-2
(5'-GATGGAGACCATTAAACAT-3'; SEQ ID NO: 27) for the 5'
portion and oligos AGL10-3 (5'-GGAGAAGGTACTAGAACG-3'; SEQ
ID NO: 28) and AGL10-4 (5'-GCCCTCTTCCATAGATCC-3'; SEQ ID
NO: 29) for the 3' portion of the gene. All coding
regions and intron-exon boundaries of the mutant alleles
were sequenced.



Sequence analysis of the cal-1 allele, which
exists in the wild-type Wassilewskija (WS) ecotype,
revealed a cluster of three amino acid differences in the
seventh exon, relative to the wild-type gene product from
Landsberg erecta (Ler) (Figure 8). One or more of these
amino acid differences can be responsible for the cal
phenotype, because the cal-1 gene was expressed normally
and the transcribed RNA was correctly spliced in the WS
background. The three additional cal alleles that were
isolated, designated cal-2, cal-3, and cal-4, exhibited
phenotypes similar to that of the cal-1 allele.

Sequence analyses revealed a single missense
mutation in each of these alleles (Figure 8). Since the
mutations in the cal-2 and cal-3 alleles lie in the MADS
domain, these mutations can affect the ability of CAL to
bind DNA and activate its target genes. Because the cal-4
allele contains a substitution in the K domain, a motif
thought to be involved in protein-protein interactions,
this mutation can affect the ability of CAL to form
homodimers or to interact with other proteins such as AP1.

B. Expression pattern of CAULIFLOWER

To characterize the temporal and spatial
pattern of CAL RNA accumulation, RNA in situ
hybridizations were performed using a CAL-specific probe.
35S-labeled antisense CAL and BoCAL mRNA was synthesized
from ScaI-digested cDNA templates and hybridized to 8 µm
sections of Arabidopsis Ler or Brassica oleracea
inflorescences. The probes did not contain any MADS box
sequences in order to avoid cross-hybridization with
other MADS box genes. Hybridization conditions were as
previously described (Drews et al., Cell 65:991 (1991),
which is herein incorporated by reference).

As with AP1, CAL RNA accumulated in young
flower primordia, consistent with the ability of CAL to
substitute for AP1 in specifying floral meristems. In
contrast to AP1 RNA, however, which accumulated at high
levels throughout sepal and petal development, CAL RNA
was detected only at very low levels in these organs.
These results demonstrate that CAL was unable to
substitute for AP1 in specifying sepals and petals, at
least in part as a result of the relatively low levels of
CAL RNA in these developing organs.

C. Molecular Basis of the cauliflower phenotype

The cal phenotype in Arabidopsis is similar to
the inflorescence structure that develops in the closely
related species Brassica oleracea var. botrytis, the
cultivated garden variety of cauliflower, indicating that
the CAL gene can contribute to the cal phenotype of this
agriculturally important species. Thus, CAL gene
homologs were isolated from a Brassica oleracea line that
produces wild-type flowers (BoCAL) and from the common
garden variety of cauliflower Brassica oleracea var.
botrytis (BobCAL).

The single-copy BobCAL gene (Snowball Y
Improved, NK Lawn & Garden, Minneapolis, MN) was isolated
from a size-selected genomic library in λBlueStar
(Novagen) on a 16-kbp BamHI fragment with the Arabidopsis
CAL gene as a probe. The BoCAL gene was isolated from a
rapid cycling line (Williams and Hill, Science 232:1385
(1986)) by PCR on both RNA and genomic DNA. The cDNA was
isolated by RT-PCR using the oligonucleotides Bob1
(5'-TCTACGAGAAATGGGAAGG-3'; SEQ ID NO: 30) and Bob2
(5'-GTCGATATATGGCGAGTCC-3'; SEQ ID NO: 31). The 5'
portion of the gene was obtained using oligonucleotides
Bob1 (SEQ ID NO: 30) and Bob4B
(5'-CCATTGACCAGTTCGTTTG-3'; SEQ ID NO: 32). The 3'
portion was obtained using oligonucleotides Bob3
(5'-GCTCCAGACTCTCACGTC-3'; SEQ ID NO: 33) and Bob2 (SEQ
ID NO: 31).

RNA in situ hybridizations were performed to
determine the expression pattern of the BoCAL gene from
Brassica oleracea. As in Arabidopsis, BoCAL RNA
accumulated uniformly in early floral primordia and later
was excluded from the cells that give rise to stamens and
carpels.

DNA sequence analyses revealed that the open
reading frame of the BoCAL gene is intact, whereas that
of the BobCAL gene is interrupted by a stop codon in
exon 5 (Figure 8). The resulting BobCAL protein product
is truncated after only 150 of the wild-type 255 amino
acids. Because similar stop codon mutations in the fifth
exon of the Arabidopsis AP1 coding sequence result in
plants having a severe ap1 phenotype, the BobCAL protein
likely is not functional. These results indicate that, as
in Arabidopsis, the molecular basis for the cauliflower
phenotype in Brassica oleracea




var. botrytis is due, at least in part, to a mutation in
the BobCAL gene.
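
The effect of a premature stop codon of the kind described above can be illustrated by translating a coding sequence until the first stop is reached. The sketch below uses the standard genetic code. The two short sequences are for demonstration only: the first reproduces the opening codons of the form shown in the sequence figures (MGRGRVQ), and the second is a hypothetical variant with an introduced stop. The helper `codon_of_nucleotide` assumes that nucleotide numbering begins at the first base of the start codon; under that assumption, nucleotide 451 (the position named in the claims) is the first base of codon 151, which is consistent with truncation after 150 amino acids. All function names are hypothetical.

```python
from itertools import product
from typing import Tuple

# Standard genetic code, with codons ordered by base in the order T, C, A, G.
_BASES = "TCAG"
_AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(_BASES, repeat=3), _AMINO_ACIDS)
}


def translate(cds: str) -> str:
    """Translate a coding sequence from its first base, stopping at the first stop codon."""
    cds = cds.upper().replace(" ", "")
    protein = []
    for i in range(0, len(cds) - 2, 3):
        amino_acid = CODON_TABLE[cds[i:i + 3]]
        if amino_acid == "*":
            break  # a stop codon ends translation
        protein.append(amino_acid)
    return "".join(protein)


def codon_of_nucleotide(position: int) -> Tuple[int, int]:
    """Map a 1-based coding-sequence position to (codon number, position within codon)."""
    if position < 1:
        raise ValueError("position must be 1 or greater")
    return (position - 1) // 3 + 1, (position - 1) % 3 + 1


if __name__ == "__main__":
    intact = "ATG GGA AGG GGT AGG GTT CAG"      # demonstration fragment
    with_stop = "ATG GGA AGG TGA AGG GTT CAG"   # hypothetical: codon 4 changed to a stop
    print(translate(intact), translate(with_stop))
    print(codon_of_nucleotide(451))
    print(f"fraction of wild-type length retained: {150 / 255:.1%}")
```

A protein truncated after 150 of 255 residues retains a little under 60% of the wild-type length.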

EXAMPLE IV 

Conversion of inflorescence shoots into flowers in a
CAULIFLOWER transgenic plant

This example describes methods for producing a
transgenic CAL plant.

Ectopic expression of CAULIFLOWER converts
inflorescence shoots to flowers

Transgenic Arabidopsis plants that ectopically
express CAL in shoot meristem were generated. The
full-length CAL cDNA was inserted downstream of the 35S
cauliflower mosaic virus promoter in the EcoRI site of
pMON530 (Monsanto Co., St. Louis, Missouri). This plasmid
was introduced into Agrobacterium strain ASE and used
to transform the Columbia ecotype of Arabidopsis using a
modified vacuum infiltration method described by Bechtold
et al. (supra, 1993). The 96 lines generated that
harbored the 35S-CAL construct had a range of weak to
strong phenotypes. The transgenic plants with the
strongest phenotypes (27 lines) closely resembled the tfl
mutant.

35S-CAL transgenic plants had converted apical
and lateral inflorescence shoots into flowers and showed
an early flowering phenotype. These results demonstrate
that CAL is sufficient for the conversion of shoots to
flowers and for promoting early flowering.

EXAMPLE V 

Conversion of shoots into flowers in a
LEAFY transgenic plant

This example describes methods for producing a
transgenic LFY Arabidopsis and aspen.

A. Conversion of Arabidopsis shoots by LEAFY

Transgenic Arabidopsis plants were generated by
transforming Arabidopsis with LFY under the control of
the cauliflower mosaic virus 35S promoter (CaMV35S) (Odell
et al., supra, (1985)). A LFY complementary DNA (Weigel
et al., Cell 69:843-859 (1992), which is incorporated
herein by reference) was inserted into a T-DNA
transformation vector containing a CaMV 35S promoter/3'
nos cassette (Jack et al., supra, 1994). Transformed
seedlings were selected for kanamycin resistance.
Several hundred transformants in three different genetic
backgrounds (Nossen, Wassilewskija and Columbia) were
recovered and several lines were characterized in detail.

High levels of LFY RNA expression were detected
by northern blot analysis. In general, Nossen lines had
weaker phenotypes, especially when grown in short days.
The 35S-LFY transgene of line DW151.117 (ecotype
Wassilewskija) was introgressed into the erecta
background by backcrossing to a Landsberg erecta strain.
Plants were grown under 16 hours light and 8 hours dark.
The 35S-LFY transgene provided at least as much LFY
activity as the endogenous gene and completely suppressed
the lfy mutant phenotype when crossed into the background
of the lfy-6 null allele.

Most 35S-LFY transgenic plant lines
demonstrated a very similar, dominant and heritable
phenotype. Secondary shoots that arose in lateral
positions were consistently replaced by solitary flowers,
and higher-order shoots were absent. Although the number
of rosette leaves was unchanged from the wild type,
35S-LFY plants flowered earlier than wild type; the
solitary flowers in the axils of the rosette leaves
developed and opened precociously. In addition, the
primary shoot terminated with a flower. In the most
extreme cases, a terminal flower was formed immediately
above the rosette. This gain of function phenotype
(conversion of shoots to flowers) is the opposite of the
lfy loss of function phenotype (conversion of flowers to
shoots). These results demonstrate that LFY encodes a
developmental switch that is both sufficient and
necessary to convert shoot meristem to flower meristem.

The effects of constitutive LFY expression
differ for primary and secondary shoot meristems.
Secondary meristems were transformed into flower
meristems, apparently as soon as they developed, and
each produced only a single, solitary flower. In contrast,
primary shoot meristem produced leaves and lateral
flowers before being consumed in the formation of a
terminal flower. These developmental differences
indicate that a meristem must acquire competence to
respond to the activity of a floral meristem identity
gene such as LFY.

B. Conversion of aspen shoots by LEAFY

Given that constitutive expression of LFY
induced precocious flowering during the vegetative phase
of Arabidopsis, the effect of LFY on the flowering of
other species was examined. The perennial tree, hybrid
aspen, is derived from parental species that flower
naturally only after 8-20 years of growth (Schopmeyer
(ed.), USDA Agriculture Handbook 450: Seeds of Woody
Plants in the United States, Washington DC, USA: US
Government Printing Office, pp. 645-655 (1974)). 35S-LFY
aspen plants were obtained by Agrobacterium-mediated
transformation of stem segments and subsequent
regeneration of transgenic shoots in tissue culture.

Hybrid aspen was transformed exactly as
described by Nilsson et al. (Transgen. Res. 1:209-220
(1992), which is incorporated herein by reference).
Levels of LFY RNA expression were similar to those of
35S-LFY Arabidopsis, as determined by northern blot
analysis. The number of vegetative leaves varied between
different regenerating shoots, and those with a higher
number of vegetative leaves formed roots, allowing for
transfer to the greenhouse. Individual flowers were
removed either from primary transformants that had been
transferred to the greenhouse, or from catkins collected
(in spring, 1995, at Carlshem, Umeå, Sweden) from a tree
whose age was determined by counting the number of annual
rings in a core extracted with an increment borer at 1.5
meters above ground level. Flowers were fixed in
formaldehyde/acetic acid/ethanol and destained in ethanol
before photography.

The overall phenotype of 35S-LFY aspen was 
similar to that of 35S-LFY Arabidopsis. In wild-type 
plants of both species, flowers normally are formed in 
lateral positions on inflorescence shoots. In aspen, 
these inflorescence shoots, called catkins, arise from 
the leaf axils of adult trees. In both 35S-LFY
Arabidopsis and 35S-LFY aspen, solitary flowers were
formed instead of shoots in the axils of vegetative
leaves. Moreover, as in Arabidopsis, the secondary
shoots of transgenic aspen were more severely affected
than the primary shoot.

Regenerating 35S-LFY aspen shoots initially
produced solitary flowers in the axils of normal leaves.
However, the number of vegetative leaves was limited, and
the shoot meristem was prematurely consumed in the
formation of an aberrant terminal flower. Precocious
flower development was specific to 35S-LFY transformants
and was not observed in non-transgenic controls.
Furthermore, not a single instance of precocious flower
development has been observed in more than 1,500 other
lines of transgenic aspen generated with various
constructs from 1989 to 1995 at the Swedish University of
Agricultural Sciences. These results demonstrate that a
heterologous floral meristem identity gene product is
active in an angiosperm.

Although the invention has been described with
reference to the examples above, it should be understood
that various modifications can be made without departing
from the spirit of the invention. Accordingly, the
invention is limited only by the following claims.

We claim:

1. A nucleic acid molecule encoding a
CAULIFLOWER (CAL) gene product having at least about 70
percent amino acid identity with amino acids 1 to 160 of
the sequence shown in Figure 5 (SEQ ID NO: 10) or with
amino acids 1 to 160 of the sequence shown in Figure 6
(SEQ ID NO: 12).

2. The nucleic acid molecule of claim 1,
wherein said CAL gene product is selected from the group
consisting of Arabidopsis thaliana CAL having the amino
acid sequence shown in Figure 5 (SEQ ID NO: 10) and
Brassica oleracea CAL having the amino acid sequence
shown in Figure 6 (SEQ ID NO: 12).

3. A nucleic acid molecule selected from the
group consisting of a nucleic acid molecule having the
nucleic acid sequence shown in Figure 5 (SEQ ID NO: 9)
and a nucleic acid molecule having the nucleic acid
sequence shown in Figure 6 (SEQ ID NO: 11).

4. A nucleic acid molecule encoding a
truncated CAL gene product having at least about 70
percent amino acid identity with amino acids 1 to 150 of
the sequence shown in Figure 7 (SEQ ID NO: 14).

5. The nucleic acid molecule of claim 4,
wherein said truncated CAL gene product is Brassica
oleracea var. botrytis CAL having the amino acid sequence
shown in Figure 7 (SEQ ID NO: 14).

6. A nucleic acid molecule having the nucleic
acid sequence shown in Figure 7 (SEQ ID NO: 13).

7. A nucleotide sequence that hybridizes under
relatively stringent conditions to a nucleic acid
molecule selected from the group consisting of:

the nucleic acid molecule of claim 3 or a
nucleic acid molecule complementary
thereto; and
the nucleic acid molecule of claim 6 or a
nucleic acid molecule complementary
thereto.

8. A CAL gene, comprising a CAL gene selected
from the group consisting of an Arabidopsis thaliana CAL
gene having the nucleotide sequence shown in Figure 13
(SEQ ID NO: 20), a Brassica oleracea CAL gene having the
nucleotide sequence shown in Figure 14 (SEQ ID NO: 21)
and a Brassica oleracea var. botrytis CAL gene having the
nucleotide sequence shown in Figure 15 (SEQ ID NO: 22).

9. A nucleotide sequence that hybridizes under
relatively stringent conditions to the CAL gene of
claim 8, or a complementary sequence thereto.

10. A vector, comprising the nucleic acid
molecule of claim 1.

11. A vector, comprising the gene of claim 8.

12. A vector, comprising a nucleic acid
molecule selected from the group consisting of the
nucleic acid molecule of claim 2 and the nucleic acid
molecule of claim 3.

13. A host cell, comprising the vector of
claim 10.

14. The vector of claim 10, wherein said
vector is an expression vector.

15. An expression vector, comprising a nucleic
acid molecule selected from the group consisting of the
nucleic acid molecule of claim 2 and the nucleic acid
molecule of claim 3.

16. The expression vector of claim 14, further
comprising a cauliflower mosaic virus 35S promoter.

17. The expression vector of claim 14, further
comprising an inducible regulatory element.

18. A kit for converting shoot meristem to
floral meristem in an angiosperm, comprising the
expression vector of claim 14.

19. A kit for promoting early flowering in an
angiosperm, comprising the expression vector of claim 14.

20. A CAL polypeptide having at least about 70
percent amino acid identity with amino acids 1 to 160 of
the sequence shown in Figure 5 (SEQ ID NO: 10) or with
amino acids 1 to 160 of the sequence shown in Figure 6
(SEQ ID NO: 12).

21. The CAL polypeptide of claim 20, wherein
said CAL polypeptide is Arabidopsis thaliana CAL
polypeptide having the amino acid sequence shown as amino
acids 1 to 255 in Figure 5 (SEQ ID NO: 10).

22. The CAL polypeptide of claim 20, wherein
said CAL polypeptide is Brassica oleracea CAL polypeptide
having the amino acid sequence shown as amino acids 1 to
255 in Figure 6 (SEQ ID NO: 12).

23. An antibody that specifically binds the
CAL polypeptide of claim 20.

24. The antibody of claim 23, wherein said
antibody is a monoclonal antibody.

25. A truncated Brassica oleracea var.
botrytis CAL polypeptide having the amino acid sequence
shown as amino acids 1 to 150 in Figure 7 (SEQ ID NO:
14).

26. An antibody that specifically binds the
truncated Brassica oleracea var. botrytis CAL polypeptide
of claim 25.

27. A method of identifying a Brassica having
a modified CAL allele, comprising detecting a
polymorphism associated with a CAL locus, said CAL locus
comprising a modified CAL allele that does not encode an
active CAL gene product.

28. The method of claim 27, wherein said
modified CAL allele encodes a truncated CAL gene product.

29. The method of claim 27, wherein said
polymorphism is within a CAL gene.

30. The method of claim 29, wherein said
polymorphism is detectable as a restriction fragment
length polymorphism.

31. The method of claim 30, wherein said
polymorphism is at nucleotide 451 of the nucleic acid
sequence shown in Figure 7 (SEQ ID NO: 13).
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-81 

* * * ♦ * • 

GAATTCCICG ACCTACGTCA GOGCCCTGAC GTAGCTCGAA G TC T GA GCTC TTCTTTRIAT 

-21 

• * * * • . 

CTCTCTTGTA GTITCTTATT CQ G G STCTTT GTTTTGTTTG GTTCTTITAG AGTAAGAAGT 

* * • • • 

TXCTTAAAAA AGGAXCAAAA ATG GGA AGG GST AGG GTT CAA TIG AAG AGO ATA 

MGRGRVQLKR1> ii 

40 

* * * * • 

GAG AAC AAG ATC AAT AGA CAA GIG ACA TTC TOG AAA AGA AGA CCT GOT 
ENKINRQVTFSKRRAG> 27 

100 

* • • • 

CTT TTG AAG AAA GCT CAT GAG ATC TCT GTT CTC TGT GAT GCT CAA GTT 
LLKKAHEI SVLCDAEV> 43 

160 

* • * • * 

GCT CTT GTT GTC TTC TOC CAT AAG GGA AAA CTC TTC GAA TAG TCC ACT 
ALVVFSHKGKLPEYST> 59 

220 

* • * * * 

GAT TCT TGT AUG GAG AAG ATA CTT GAA CCC TAT GAG AGG TAC TCT TAC 
DSCMEICX LERYERYSY> 75 

* * • # # 

GCC GAA AGA CAG CTT ATT GCA CCT GAG TCC GAC GTC AAT ACA AAC TCC 
AERQLIAPESDVMTHW^ 9i 

280 

* * * * • 

TCS ATG GAG TAT AAC AGG CTT AAG GCT AAG ATT GAG CTT TTG GAG AGA 
SHEYNRLKAK TELLE R> 107 

340 

* * * * 

AAC CAG AGG CAT TOT CTT GGG GAA GAC TTG CAA GCA ATG AGC CCT AAA 
NQRHYLGEDLQAMSPJ> 123 

400 

* * • • • 

GAG CTT CAG AAT CTC GAG CAG CAG CTT GAC ACT GCT CTT AAG CAC ATC 
ELQNLEQQLDTALKHX> 139 

460 

• • • • * 

CCC ACT AGA AAA AAC CAA CCT ATG TAC GAG TCC ATC AAT GAG CTC CAA 
RTRKNQLMYESINELQ> 155 

• * * • * 

AAA AAG GAG AAG GCC ATA CAG GAG CAA AAC AGC ATG CTT TCT AAA CAG 
KKEKAXQEQNSMLSKO 171 



FIG. IA 
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520 

• • * * « 

ATC AAG GAG AOS GAA AAA. ATT CTT AGG GCT GAA CAG GAG CAS TOG GAT 
XKEREKXLRAQQEQWD> 187 

580 

* • » * 

CAG CAG AAC CAA GGC CAC AAT ATG CCT CCC CCT CIG CCA CCG CAG CAG 
QQNQGHNMPPPLPPQQ> 203 

640 

* • * • • 

CAC CAA ATC CAG CAT CCT TAC ATS CIC TCT CAT CAG CCA TCT OCT TTT 
HQIQHPYMLSHQPSPP> 219 

700 

* * • • « 

CIC AAC ATG GGT GGT CIG TAT CAA GAA GAT GAT CCT ATG GCA ATG AGG 
LNMGGLYQEDDPMAMR> 235 

* * • * » 

AAT GAT CIC GAA CTG ACT CTT GAA CCC GTT TAC AAC TGC AAC CXT GGC 
NDLELTLEPVYNCNLG> 251 

760 

* * * * • 

TGC TIC GCC GCA TGA AGC APT TDC ATA TAT ATA TTT GXA ATC GTC AAC 
CFAA*SISIYIFVIVlfc> 267 

820 

* • * • 

AAT AAA AAC ACT TTG CCA CAT ACA TAT AAA TAG TOG CIA GGC TCT TTT 
NKNSLPH TYK*WLGSP> 283 

880 

* • * * • 

CAT CCA ATT AAT ATA TTT TOG CAA ATG TIC GAT GTT CTT ATA TCA TGA 
HPINIFWQMFDVLISS> 299 

940 

* * • * • * 

TAT ATA AAT TAG C AGGCICCTIT CTICTTTTGT AATTTGATAA GTTTATITOC 
Y I M * X> 302 

1000 

* * * • • • 

TTCAATATGG AGCAAAATTG TAATATAXTT GAAGGIC A GA GAGAATGAAC GXGAACITAA 

1060 

* * # * • * 
TAGAAAAAAA AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA AAACCCGACG TAGCICGAGG 



AATIC 

FIG IB 
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TCTTA6AG6A AATA6TTCCT TTAAAAGGGA TAAAA ATG G6A AGG GGT AGG GTT CAG 

M G R G R V Q 7 

25 

• • • • • 

TTG AAG AGG ATA GAA AAC AAG ATC AAT AGA CAA GTG ACA TTC TCG AAA 
LKRIENKINRQVTFSK 23 

85 

♦ • • • • 

AGA AGA GCT GGT CTT ATG AAG AAA GCT CAT GAG ATC TCT GTT CTG TGT 
RRAGLMKKAHEISVLC 39 

115 

• • • « • 

GAT GCT GAA GTT GCG CTT GTT GTC TTC TCC CAT AAG GGG AAA CTC TTT 
DAEVALVVFSHKGKLF 55 

205 

• • • • 

GAA TAC TCC ACT GAT TCT TGT ATG GAG AAG ATA CTT GAA CGC TAT GAG 
EYSTDSCMEKILERYE 71 

• • • • • 

AGA TAC TCT TAC GCC GAG AGA CAG CTT ATA GCA CCT GAG TCC GAC TCC 
RYSYAERQL I APESDS 87 

265 

• • • • • 

AAT ACG AAC TGG TCG ATG GAG TAT AAT AGG CTT AAG GCT AAG ATT GAG 
NTNWSMEYNRLKAK I E 103 

325 

• • • • • 

CTT TTG GAG AGA AAC CAG AGG CAC TAT CTT GGG GAA GAC TTG CAA GCA 
LLERNQRHYLGEDLQA 119 

385 

• * • • • 

ATG AGC CCT AAG GAA CTC CAG AAT CTA GAG CAA CAG CTT GAT ACT GCT 
MSPKELQNIEQQLDTA 135 

CTT AAG CAC ATC CGC TCT AGA AAA AAC CAA CTT AGT TAC GAC TCC ATC 
LKH1RSRKNQLMYDSI 151 

• • • • • 

AAT GAG CTC CAA AGA AAG GAG AAA GCC ATA CAG GAA CAA AAC AGC ATG 
NELQRKEKAIQEQNSM 167 

505 

• • • • • 

CTT TCC AAG CAG ATT AAG GAG AGG GAA AAC GTT CTT AGG GCG CAA CAA 
LSKQ1 KERENVLRAQQ 183 



FIG 2A 
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565 

• • • • • 

6A6 CAA T6G GAC 6A6 CA6 AAC CAT 66C CAT AAT ATG CCT CCG CCT CCA 
EQWDEQNHGHNMPPPP 199 

625 

• • « • • 

CCC CCG CAG CAG CAT CAA ATC CAG CAT CCT TAC ATG CTC TCT CAT CAG 
PPQQHQIQHPYMLSHQ 215 

685 

• • • • 

CCA TCT CCT TTT CTC AAC ATG GGG GGG CTG TAT CAA GAA GAA GAT CAA 
PSPFLNMGGLYQEEDO 231 



ATG GCA ATG AGG AGG AAC GAT CTC GAT CTG TCT CTT GAA CCC GGT TAT
MAMRRNDLDLSLEPGY 247

715

• »

AAC TGC AAT CTC GGC TGC
N C N L G C 253



FIG. 2B 
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ATG 66A A6G 66T AG6 GTT CAG TTG AAG AGG ATA 
MGRGRVQLKRI

60 

• • • • 

AGA CAA GTG ACA TTC TCG AAA AGA AGA GCT GGT 
RQ VTFSKRRAG 

120 

• • • 

CAT GAG ATC TCT GTT CTG TGT GAT GCT GAA GTT
HEISVLCDAEV 



TCC CAT AAG GGG AAA CTC TTT GAA TAC CCC ACT 
SHKGKLFEYPT 



GAG ATA CTT GAA CGC TAT GAG AGA TAC TCT TAC
EILERYERYSY 



GAA AAC AAG ATC AAT
E N K I N 



CTT ATG AAG AAA GCT
L M K K A 



GCG CTT GTT GTC TTC 
A L V V F 

180 

• » 

GAT TCT TGT ATG GAG 
D S C M E 

210 

GCC GAG AGA CAG CTT
A E R Q L



16 



32 



48 



60 



80 



ATA GCA CCT GAG TCC GAC TCC AAT ACG AAC TGG 
IAPESDSNTNW 

300 

• • • • 

AGG CTT AAG GCT AAG ATT GAG CTT TTG GAG AGA
RLKAKIELLER 



TCG ATG GAG TAT AAT 
S M E Y N 96 



AAC CAG AGG CAC TAT 
N Q R H Y 112 



360 

• • « 

CTT GGG GAA GAC TTG CAA GCA ATG AGC CCT AAG 
LGEDLQAMSPK 



GAG CAA CAG CTT GAT ACT GCT CTT AAG CAC ATC 
EQQLDTALKHI



CAA CTT ATG TAC GAC TCC ATC AAT GAG CTC CAA



GAA CTC CAG AAT CTA 
E L Q N L 128

420 

CGC TCT AGA AAA AAC 
R S R K N 144 

480 

AGA AAG GAG AAA GCC 
R K E K A 160 



ATA CAG GAA CAA AAC AGC ATG CTT TCC AAG CAG 
IQEQNSMLSKQ

540 

• • • • 

AAC GTT CTT AGG GCG CAA CAA GAG CAA TGG GAC 
NVLRAQQEQWD



ATT AAG GAG AGG GAA 
I K E R E 176 



GAG CAG AAC CAT GGC 
E Q N H G 192 



FIG. 3A 
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600 

ft • ft ft ft 

CAT AAT ATG CCT CCG CCT CCA CCC CCG CAG CAG CAT CAA ATC CAG CAT
HNMPPPPPPQQHQIQH 208

660 

• ft ft ft ft 

CCT TAC ATG CTC TCT CAT CAG CCA TCT CCT TTT CTC AAC ATG GGA GGG 
PYMLSHQPSPFLNMGG 224

720 

• ft .ft ft ft 

CTG TAT CAA GAA GAA GAT CAA ATG GCA ATG AGG AGG AAC GAT CTC GAT 
LYQEEDQMAMRRNDLD 240

ft • ft ft 

CTG TCT CTT GAA CCC GTT TAC AAC TGC AAC CTT GGC CGT CGC TGC TGA
LSLEPVYNCNLGRRC* 255 



FIG. 3B 
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ATC GGA AGG GGT AGG GIT GAA TIG AAG AGG ATA GAG AAC AAG 
MGRGRVELKRIENK> 14 

51 

• • * * 

ATC AAT AGA CAA GTG ACA TTC TCG AAA AGA AGA ACT GGT CTT TTG AAG
INRQVTFSKRRTGLLK> 30

111 

• • # • * 

AAA GCT CAG GAG ATC TCT GTT CTT TGT GAT GCC GAG GTT TCC CTT ATT
KAQEISVLCDAEVSLI> 46

171 

• • • * • 

GTC TTC TCC CAT AAG GGC AAA TTG TTC GAG TAC TCC TCT GAA TCT TGC
VFSHKGKLFEYSSESC> 62

231 

* * * *\ * 

ATG GAG AAG GTA CTA GAA CGC TAC GAG AGG TAT TCT TAC GCC GAG AGA
MEKVLERYERYSYAER> 78

• * * * * 

OG CTG ATT GCA CCT GAC TCT CAC GIT AAT GCA CAG ACG AAC TOG TCA 
QLIAPDSHVKAQTNWS> 

291 

• * * * 

ATC GAG TOT AGO AGG CTT AAG GCC AAG ATT GAG CTT TIG GAG AGA AAC 
MEYSRLKAKIELLERN> 110

351 

* * • * • 

CAA AGG CAT TAT CTG GGA GAA GAG TTG GAA CCA ATC AGC CTC AAG GAT 
QRHYLGEELEPMSLKD> l ^ 

411 

* * . 

CTC CAA AAT CIG GAG CAG CAG CTT GAG ACT GCT CTT AAG CAC ATT CGC 
LQNLEQQLETALKH I R> 152 

471 

• * * * # 

TOO AGA AAA AAT CAA CTC ATC AAT GAG TOC CIC AAC CAC CTC CAA AGA 
SRKNQ LMNESLNHL QR> 168 

* * # « # 

AAG GAG AAG GAC ATA CAG GAG GAA AAC AGC ATC CTT ACC AAA CAG ATA 
KEKE I Q E E NSMLTK Q I> 184 

531 

• * • * 

AAG GAG AGG GAA AAC ATC CTA AAG ACA AAA CAA ACC CAA TCT GAG CAG 
KEREN I LKTKQTQC EQ> 200 

591 

• • * » • 

CTC AAC CGC AGC GTC GAC GAT CTA CCA CAG CCA CAA CCA TXT CAA CAC 

FIG. 5A
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LNRSVDDVPQ PQPFQH> 216 

651 

• • • • • 

OCC CAT CTT TAC ATG ATC OCT CAT CAG ACT TCT OCT TIC CIA AAT A1G 
PHLYMIAHQTSPFLNK> 232 

711 

* * * * • 

GGT GCT TIG TAC CAA GGA GAA GAC CAA ACG GOG AXG AGG AGG AAC AAT 
GGLYQGEDQTAMRRNK> 248 

* * * * • 

CIG GAT CTG ACT CTT GAA CCC ATT TAC AAT TAC CTT GGC TOT TAC OCC 
LDLTLE P IYNYLGCYA> 262 

GCP TCA — 

A • X> 263 
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ATG 6GA AGG GGT AGG 
M G R G R 

60 

• ♦ 

CGA CAA GTG ACG TTT 
R Q V T F 



CAT GAG ATC TCG ATC 
H E I S I 



TCC CAT AAG GGG AAA 
S H K G K 



AAG GTA CTA GAA CAC 
K V L E H 



GTT GAA ATG AAG AGG ATA GAG AAC AAG ATC AAC 
VEMKRIENKIN 



TCG AAA AGA AGA GCT GGT CTT TTG AAG AAA GCC 
SKR RAGLLKKA 

120 

CTT TGT GAT GCT GAG GTT TCC CTT ATT GTC TTC 
LCDAEVSL IVF 

180 

CTG TTC GAG TAC TCG TCT GAA TCT TGC ATG GAG 
LFEYSSESCME 

TAC GAG AGG TAC TCT TAC GCC GAG AAA CAG CTA 



16 



32 



48 



60 



80 



AAA GTT CCA GAC TCT 
K V P D S 

300 

TAT AGC AGG CTT AAG 
Y S R L K 



CAT TAT CTG GGC GAA 
H Y L G E 



AAT CTG GAG CAG CAG 
N L E Q Q



AAA AAT CAA CTA ATG 
K N Q L M 



AAA GAA ATA CTG GAG 
K E I L E 

540 

• « 

AGG GAG AGT ATC CTA 
R E S I L 



CAC GTC AAT GCA CAA ACG AAC TGG TCA GTG GAA 
HVNAQTNWSVE 



GCT AAG 
A K 



GAT TTA 
D L 



ATT GAG CTT 
I E L 

360 

GAA TCA ATC 
E S I 



CTT GAC 
L D 



CAC GAG 
H E 



GAA AAC 
E N 



ACT TCT CTT 
T S L 



TCC CTC AAC 
S L N 



AGC ATG CTT 

S M L 



TTG GAG AGA AAC CAA AGG 
L E R N Q R 



AGC ATA AAG GAG CTA CAG 

S I K E L Q

420 

AAA CAT ATT CGC TCG AGA 
K H I R S R 

480 

CAC CTC CAA AGA AAG GAG 
H L Q R K E 



GCC AAA CAG ATA AGG GAG 
A K Q I R E



AGG ACA CAT CAA AAC CAA TCA GAG CAG CAA AAC 
RTHQNQSEQQN 



96 

112 

128 

144 

160 
176 

192 
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600 



CGC AGC CAC CAT GTA GCT CCT CAG CCG CAA CCG CAG TTA AAT CCT TAC
RSHHVAPQPQPQLNPY 

660 

* * • # « 

ATG GCA TCA TCT CCT TTC CTA AAT ATG GGT GGC ATG TAC CAA GGA GAA
MASSPFLNMGGMYQGE 

720 

TAT CCA ACG GCG GTG AGG AGG AAC CGT CTC GAT CTG ACT CTT GAA CCC 
YPTAVRRNRLDLTLEP 
• • • 

ATT TAC AAC TGC AAC CTT GGT TAC TTT GCC GCA TGA 
IYNCNLGYFAA*



208 

221 

240 
251 
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ATG GGA AGG GGT AGG GTT GAA ATG AAG AGG ATA GAG AAC AAG ATC AAC 
MGRGRVEMKRIENKIN 16

60 

• ♦ • • • 

AGA CAA GTG ACG TTT TCG AAA AGA AGA GCT GGT CTT TTG AAG AAA GCC 
RQVTFSKRRAGLLKKA 32 

120 

• • * • • 

CAT GAG ATC TCG ATT CTT TGT GAT GCT GAG GTT TCC CTT ATT GTC TTC 
HEISILCDAEVSLIVF 48

180 

• • • • » 

TCC CAT AAG GGG AAA CTG TTC GAG TAC TCG TCT GAA TCT TGC ATG GAG 
SHKGKLFEYSSESCME 64 

240 

AAG GTA CTA GAA CGC TAC GAG AGG TAC TCT TAC GCC GAG AAA CAG CTA 
KVLERYERYSYAEKQL 80 
• » • • 

AAA GCT CCA GAC TCT CAC GTC AAT GCA CAA ACG AAC TGG TCA ATG GAA 
KAPDSHVNAQTNWSME 96 

300 

• • • • • 

TAT AGC AGG CTT AAG GCT AAG ATT GAG CTT TGG GAG AGG AAC CAA AGG 
YSRLKAKIELWERNQR 112 

360 

* * • • • 

CAT TAT CTG GGA GAA GAT TTA GAA TCA ATC AGC ATA AAG GAG CTA CAG
HYLGEDLESISIKELQ 128

420

* • • • ft 

AAT CTG GAG CAG CAG CTT GAC ACT TCT CTT AAA CAT ATT CGC TCC AGA 
NLEQQLDTSLKHIRSR 144

480

* * • * 

AAA AAT CAA CTA ATG CAC TAG T CCCTCA ACCACCTCCA AAGAAAGGAG 
KNQLMH *

AAAGAAATAC TGGAGGAAAA CAGCATGCTT GCCAAACAGA TAAAGGAGAG GGAGAGTATC 

600

CTAAGGACAC ATCAAAACCA ATCAGAGCAG CAAAACCGCA GCCACCATGT AGCTCCTCAG 

660 

CCGCAACCGC AGTTAAATCC TTACATGGCA TCATCTCCTT TCCTAAATAT GGGTGGCATG

720

TACCAAGGAG AATATCCAAC GGCGGTGAGG AGGAACCGTC TCGATCTGAC TCTTGAACCC 

• • • 

ATTTACAACT GCAACCTTGG TTACTTTGCC GCATGA 

FIG. 7 
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* * * • » 

GAATICCCCG GATCXCCATA TACAIATCAT ACATATATAT AGTATACTAT 
60 

* * « * * 

CTITAGACTG ATTTCTCTAT ACACTATCTT TTAACTTATG TA3CGTTTCA 

120 

* * • • * 

AAACTCAGGA CGTACATGIT TDkAMTTOG TTATATAACC ACGACCATTT 

180 

* * * * • 

CAAGTATATA TCTCATACCA TACCAOATTT AATATAACTT CTATGAAGAA 

240 

* • * * * 

AATACATAAA GTTGGATTAA AATGCAAGTG ACATCTTTTT AGCATAGGTT 

300 

* * * * * 

CATTTCGCAT AGAAGAAATA TATAACTAAA AATGAACTTT AACTTAAATA 

* * * * ♦ 

GMTITACTA TATTACAATT TTOCIOTTA CATQCTCTAA TTPaTPITC 

360 

* * * * # 

TAAAATTAGT ATGATTCTTG TTTTGATGAA ACAATAATAC CGTAAGCAAT 

420 

* * * * * 

AGTIGCTAAA AGATGTCCAA ATATXTATAA ATTACAAAGT AAATCAAATA 

480 

* * * * * 

ACCAAGAAGA CA03TCGAAA ACACCAAATA AGAGAAGAAA TGGAAAAAAC 

540 

* * * * • 
AGAAAGAAAT TITITAACAA GAAAAATCAA TTAGTCCTCA AACOTGAGAT 

600 

* * * • • 

A3TTAAAGTA ATCAACTAAA ACAGGAACAC TTCACTAACA AAGAAATTTG 

* * » • * 

AMSGXGGZC CAAC1T1UAC TTAATIATAT TATTTTCTCT AAGGCTTATG 

660 

* * * • * 

CAATATATGC CTTAAGCAAA TGCCGAATCT UlTlTrmT TTIGmnG 

720 

* * • • * 

GATATTCACT GAAAATAAGG GGTTSTTSCk CACTTCAAGA TCTCAAAAGA 

780 

* * * • + 

GAAAACTATT ACAAOGGAAA TTCATTGTAA AAGAAGTGAT TAAGCAAATT 



840 
• 
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GAGCAAAGGT TCRKXGXGG TTTMTTCRT TAXATCA3TG ACATCAAATT 



GTATATATAT GGTICTTTIA TTTAACAATA 

• » * 

ACTAAATATG TTTGATTGAC GAAAAAAAAT 

960 



ACAXAGCACA TA3CAACTGA TTTTTGTDC 
1020 

GAACACACAA CATIGAAAAA ATCTITCACA 

1080 

AAATTTTGAA TACTTACAAT TOTCTTCTOG 



AICCXGCCIA CAAATCCGTC GACGCAAiaC 



TTCTCAGCTC TACCAAAAAC ATCTATTGCC 

• # 

CTTCACTCTT ACAGCTGAGA ACATTAAATA 

1260 

ACAAAGGGTT CTCACCTTAT TCCAAAAGAA 



1320 

GAGAAATCTT AATAAAAGGA AATTAAAAAT 



1380 

• • * 

GATTTTGTTT CGIAGATCTA CAGGGAAATC 



AAGCTGACAC CTQQGBMMBG ACCAglGG TC 



TTCTCTTCAC GAGftOOTOGA TAATCAAATT 



GTOOGCAGTT TTATTAAAAA A TCA TO GA CC 
1560 

CCAA3GAGAA GTOGACAOGC AAATOCTUUt GAAAGCACIG TSGITITIGC 



1620 
• 



900 

TATATGGATA TAAOGTACAA 
ATATGTATCT TTGATTAACA 

* * 

GATCATCTAC AACTTAAXAA 

AAATACTATT TTK3GCTTTC 

* * 

ATCTIOTCT CTTTOCm A 

1140 

* • 

ATTACACAGT TGTCAATTGG 
1200 

* * 

AAAAGAAAGG TCTA3TICTA 

TAATAAGCAA ATITCATAAA 

* * 

T3U3TGTAAAA TAGGGTAATA 

AGATAI'ITIU GTTCGGTTCA 



TOOGCOGTCA' ATOCAAAGCG 
1440 

GIACAAXGTT ACTTAOCCAT 
1500 

* ♦ 

GTTTAJrmC ATATOTTAA 

CGACATTACT AOGAGATATA 



AAACAAGAGA AACCAGCTIT AGCTTCTOCC TAAAACGACT CTTACOCAAA 

FIG. 10B
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1680 

TCTCTCCATA AATAAAGATC OTGAGACTCA AACAOUU3TC 

1740 

GGAAAGAAAG AAAAACITIC CTMTXCCtVT CATACCAAAG 

Tcmamc TcroxTGiac ttictimtc gsgsicttpg 



1T1TI!ATAAA 

TC7XGAORCT 
1800 
TTTTGTTIGG 



ticttttaga gtaagaagtt tctiaaaaaa 

I860 

TMGGTTCAA 1TGAAGAGGA TAGAGAACAA 
1920 

TCTOGAAAAG AAGAGCTGGT CTTTIGAAGA 

1980 

• * • 

CICIGTOMG CTCAAGTXGC ICTICTIGIC 

* * * 

CTXGGAAXAC TOCACTGATT CITQGTAACT 



GGAICAAAAA TGGGAAGGGG 

GATCAAIAGA CAAGTGACAT 

* * 

AAGCTOVTOA GATCICTGTT 

TTCTCCCATA AGGGGAAACT 

2040 
TCAACTAATT 



crmcmr 

2100 



AAAAAAATCT TITAATCTGC TACTTTAraT AffnTTTTTC 



TCEATGATTC ATACTCTTTT GTTOTTATAA 
2160 

* * * 

CTIGATTIGT TAXAGGAAAT CTIGGTTIAA 

2220 

• # • 

ATTCATCCTA AAATGTOATG AXATCTIGGT 

2280 



AGGTATCAZA GAGAfflOGGZA 



TTGCAXAAAA CCATCATTAG 



CACATCTCCA TATTATTTAT 



AIAATAAAAT GATAA3TOGT TGATCAXAAA GCXAACCCA ATICTOIGAA 



ATOATCAGTA TCGAGAAGAT A LT1UA AOGC 



2340 

TATGAGAGGT ACICTTAOGC 
2400 



OGAAAGACAG CTTATTGCAC CIGACICCGA CGTCAATCTA Tl ' IUA ATAAA 

TATPICTOCr TnCAAICCAC A3ATATATTA TATCAATCTA TTTGIAOTAT 
2460 

FIG. 10C
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AAACT 



TGATGAA3TT TATnCTUA AAACTiraDG TACACAGACA AACIGGICGA 
2520 

• » # * 

TGGCGIA3AA CAGGCTIAAG GCXAAGATTO A ULYmUA GAGAAAGCAG 



2580 

* * » # 

AflCTACACAT TTRCACTCAT CACA3TTCTA TCTUSAAAAT CGATDQGGIT 

2640 

* * * * 

CCATITOkAA GTAAGTTLAA ATTCATIGAT GCIATIGAAA TTCAGGCATr 



2700 

* • * * 

ATCTTGGGGA AGACTTQCAA GCAATGAGCC CTAAAGAGCT TCAGAATOIG 



GAGCAGCAGC TTGACACIOC TCTEAAGCAC A TPPTT7V CTA GAAAACTA3T 
2760 

* • * • 

oocncrocr attccgitga aoomcxat mAcmu qgrtacaag 
2820 

* • * • 

TCmSPTATA ATOIGAACAT TSAAATACAX ATOICTA3CT AICAA3A3AT 

2880 

* • * * 

AIATCAG»A TGAAXATCAA THGAIATCT CTMA QGTIG UT1UA ATOT 

2940 

• * • • 

A3GAOTTATC TIGIGCOTT TAAGACTOCA TAITOCTXAA AdAAXGGCT 

3000 

* * • * • 

totiaatgtt gatotoictg tajigcagaac caacxtaott adgagtccat 

* * * » 

CAATCAGCIC CAAAAAAAGG TATOXAAAAC CCCXA IOVAA TGDIIGTCTT 
3060 

* * • * 

AT A fi A G A AAC GXA9AGGAAA, GCTAATEAAC AAT0GTGO0G TTTDGGA ATO 

3120 

* * • * 

ACAGGAGAAG GOOflACAOG AGCAAAACAG CATOCITTCr AAACAGGAAC 



3180 

* » * * 

AOiroiCATC ATTlClCrTT CATCAACATG TTOTOCATTG CATEACICTT 

3240 

~_ * * * * 

ACCPIDCACT CTICIGCTCC ACACTTCCAG CCAAGCTA3A CCTACGATAT 



3300 

* * * * * 

CnCATAICT CXACTEAACT TOGGCA CC A T TAAATAAAAA TAGAAAATCT 

FIG. 10D
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* * * • * 

TIGCAAATIT GTITGAAATA GCAXAGATCT TCTCIATIGA TIGATAIAAX 
3360 

**<•*• 
CACCAOOCTG TACGIAGATA TOSITIOICC GTrTAGTHT AA Q CI CTCIC 

3420 

* * • • * 

TOQGATTGAA AATATTTTGA AMCnTTGA AATGTTTGTC CCATCATICT 

3480 

* • • • * 

TACTTRGCTC ATMCTATGT ATATGAATAT AGACACTACT CCTAATTA331 

3540 

* * * * * 

AAATCTTAIA A33USTTCMT GCATGAGTGC AACTOTGAAA AXAACTAT1T 

3600 

***** 

CHMOCOTG CmmU GCTICTXCAC TTTGAAAATT GATCATOAIA 

***** 

ATATGGITIG AAATAAATTT GCIGGCA GAT CAAGGAGAGG GAAAAAATTC 

3660 

***** 

TDU3GGCICA ACAGGAGCAG TOQGAICAGC AGAAGCAAGG CCACAATASG 

3720 

***** 

GCKXCCCTC TGCCACCQCA GOGCACCAA ATCCAGCATC CXTACATOCT 

3780 

***** 

CXCIGATCAG CCA3CICCTT TXCTCAACAT GGGGIAACAA AAAATTACTA 

3840 

***** 

ATCAGTCTTA ATTTAAAGCA CATATGTTAT GCAAGCXAGT TACGTTAGGT 

3900 

***** 

GTICTAATTr CAXIGAAGTT AXAGCTGTTA GTGATOGTTA CATGATQCXA 

***** 

QAmTCAAA CTAGAAAACT TTATTTTAAA ACATIATXTT ATTAACCTAG 

3960 

***** 

CTEAATOCAA TGGTCGOCAA ACGAACAAAC TOmSICT GGAAAAATOT 

4020 

***** 

ACA30GAA3G GTIGCGAAAA GCCTAAGTCCG ACITITCITG TOTIGSICT 

4080 

***** 

AOTICTTTAA GTACAATTTT AGTXTGTXAG ATAAATGAAA TTAATA33VIC 



4140 
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TTIGACATXT 



AAACA1MGA 

AASXGCAGrlGB 
4260 
GKXCXGGMC 



cACAAiaac TOAraTircav TrrrccrrrG tictrosgto 

4200 

* * * * 

TTACATATCC ACTTICATAT ATRTOCTATG TA3GATXGXG 

• • * • 

TCTGTR.TCAA GAAGATGATC CAATGGCAAT GAGGAGGAAT 



tcacictiga AcccGrrrac wuctqcaacc TOiuaagii: 

4320 

• * * • 

QOOGCMGAA GCA3TTCCAT ATATA33VTAT TTGTAATDGT CAACAAIAAA 

* * 

AACTRGTTTC CCATCAT^CA. TMCAAA.TAG 



FIG. 10F
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GCACCIGACT OCGACTOCAA TCTAAACOIA TITCTCICCA 
60 

• • * 

TAAATIAAAT AT3W1T11M T3OTAOTSAT ATATRCPDIT 

120 

* • * 

CTICTSAGAX ATAGACGAAC TO GICB ATOG AGTMDUCEAG 

180 

• * * 
AAGATTGAGC TTTTGGAGAG AAAGCAGAGG TACATTTTCA 

240 

* * • * 

Ararnumys aigaaatabc aaacaggatt A Mcmorr 

* * • * 

GAITALTIAT AAGAAAAIGA TOCATTOAA TAACAAAAAA 



* * ♦ 

ocicTAraa aahteagqca cnacnoGG gaagacrgc 

360 

* • • 

COCBttGGAA C1TXAGAATC TAGAGCAACA GCITOM 3UCT 

420 

* * * 

acmccoctc tagaaaagta tcaatcctcc TOmcrm , 

480 

* * * 

A3BCAACTTA AACACATATT ATTTTATTAT TCAATACATA 

540 

* * * 

GXACA3ATOT GAnTTATIG GTTGGATATA AAAGATCAAT 



agatoehga cttttzaaag aattagtaxa tagag taiua 



IMIZQCSnCG TACGTTTATC CAGAACCAAC TTATCTAOGA CTCCA3CAAT 
640 

• * * 

GAGCIOCAAA GAAAGCTATO TATAAACCCT ATCAAATTGA OGTTTACAIA 

720 

* * * 

GAMAACIGC GICTAAGAAT CdATAGGGG AGCTAACAAT CCT 

780 



T13UUCTTA1A 



CTOTA3TAAA 



TTCATCATTT 

AAAAATOCAT 
300 

A3GCATOGAX 
AAGCAATOAG 



GCICTTAAGC 



ACT&ACAIGT 

TMATGAAIBl 

CACGTCGATT 
600 

1TAGTCAA3G 




TGGAAA3GAC AGGAGAAAGC CATACAQGAA CAAAACAGCA TCCTTTOCAA 

840 

FIG. 11A
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GCAQGIOCCA TTIGTCATnv TXTTTMMC GTCAAAATOT TTICTATTOT 



AffEACIGTTA 

CTACCTACGA 
960 

CCACTATAAC 



ATATACTCGT 



GSnCITTIC 
TMAIK7IATA 

GTATGGTATA 

ATOAAGGAGA 
1260 
GCAGAACCAT 

AATCCAGCAT 
TOGGGTAGTT 
AIACAAGATA 

ATOCTEAGAT 

raTAATEAAC 
1560 
AXATTITXtjG 

TOGUAGCAGG 



GCTTOCACTG TTCTACICCA CACTTCAAGC 

* * • 

CTACGAGATT CTGCACATAT TICTOCACTT 

• • * 

TAAAATATAG ATAAAATATC ATTTTTATAG 

1020 

CAGCCAGTAC GTAGTTOGGT ATITGCCCGT 



1080 

CGGATTGAAA ATMTC™— -AQJLTACCT 

1140 

TCTATTEAGA AGTCGTGGCT TTCAAAATTG 



* * 

ACTTGGTAAC AAACTGGTGT GTGAAATIGA 

* * • 

GGGAAAACCT TCTTAGGGCG CAACAAGAGC 



* • 

GGCCATATAT GO CICCGOCT CCACCCCCGC 



1320 

CCTTACATGC TCTCTCATCA GCCATCTCCT 
1380 

• » • 

AAAAATTCGT TQCXCTOCT TTCAAGTCAT 

1440 

* • * 

GTIAGGTGTT ATAAGTCCAG TGAGTEAGGT 



TGTGTTAGTC 
1500 

CTCTAGATIG TGAATTACAA GTACEAAGAT TITTCAGITA 



900 

CAAGCTATAC 



AGCTTCGGCA 

TCTATGATTC 

TEAGTTTIAA 

TTGATOCEAT 

ATGATGAXAT 
1200 
AACTICTCAG 



AATGGGACGA 



AGCAGCATCA 



TTOCICAACA 



ATGTGTATAT 



GTATTGATCA TCAATCAAAT GGTCGTAAAA AAACAGACTT 



GAAAGTAGAT GGAATGGCTG CTAAAACTCT AAGAAACCTT 



1620 

# * ♦ • 

TCGEATTEAT TCTTOTTCAA ATIBAACTTG AQGTAGTEAG 



FIG. 11B

SUBSTITUTE SHEET (RULE 26) 



WO 97/27287 



PCT/US96/01041 



1680 

« » • • * 

ATAAATAAAC TMCTTIGAT ATQGCCTTTA COATITCAC TACAAAACAT 

1740 

* » • • * 

GXQOTTTT CAGCAOCTAT GTJUGA13ATT TGTAAGC1M ATCATGTGCA 

1800 

* * * * * 

TATCAATGTA AATGCAQGGG GCTGTATCAA GAAGAAGATC AAATOGCAAT 

* • * * * 

GRGGAGGAAC GATCTCGATC TSICICTIGA ACCCOGTTAC AACTGCAACC 

1860 
* 

T TO GQDSTDS OCOCT 

FIG. 11C




GAGCICTTCT 
60 

TIAGAGGAAA 
TCAGTTCAAG 



AAAGAAGAGC 



GATOCTC5AAG 



ATACCCCACT 



ocxsuusamtj 

360 

tcimgtaxa 

AATTCTGTGA 
•DUriCTIAOG 
AAACCAATTT 

TAGTGAIATA 

TOGATOGAGT 
660 

CCAGAGGTAC 
AGGAIRRKXG 



• * * 

TTRTATCrCT TCTXCTUSIT TCITOTITOG TTTOGITCIC 



* * # 

rnysncciTT aaaagggata aaaatgqgaa GGGSTA GGST 
120 

• • # 

AGGATAGAAA ACAAGATCAA TJUGACAACTG ACATTCTOGA 

180 



TGtTICTEATG AAGAAAGCTC ATGAGATCIC TCTICTCTg 

240 

» • * 

TTCCQCITCT TGICITCICC CAXAAGQGGA AACTCTTIGA 



300 

* * • • 

G&TXCTXGGT AACTTXCZCA TTTAAOAAAC AAAA TAC 

* * * • 

TATTTTACAT GATCATITAC TTOTTTIACA CAGXATAXAC 

* • • • 

TAATATOATC ATAAATTGTT GATGAXAAGA AGCOQCCCT 

420 

* * # 

ATIGAACAGT ATOGAGGAGA TACTTGAAGG CTATGAGAGA 

480 

* * * * 

CCGAGAGACA GCTEATAGCA CCTGAGICCG ACTCCAATCT 



540 

CTCTCCATTA ACTTATATAA ATTAAATATT ATITCAGTAT 

600 

• * • » 

TACTTATCTG TATTAAACTT GTGAGAXAXA GACGAACIG6 

• • * » 

ATAATAOOCT TAAQGCTAAG AXTGAGCTTT TOCAGAGAAA 



* • * 

ATTTTCAnC ATCATTTATA TATATGATGA AATATCAAAC 

720 



TTACTtAAAA ATGCATGATT ACTTATAAAA AAA3GA30CA 
780 

* * • 

TTIAAATAAC AAAAAAATOC ATCGATGCTC TATTGAAATT TAGGCACTAT 

840 

FIG. 12A


CnOGGGSAC ACTIGCAAGC AATGAGCCCT AAGGAACTCC AGAA3CBICA 



GCAACAGCTT 



TOCTCCTATT 



960 

TATTATTCAA 



GGAOTIGAAA 



CCAACTIATG 

ACCCTATCAA 

AGGGGAGCIA 
1260 
AGGAACAAAA 



ATTTCGTCAA 



AOC3UCACTTC 



GATCCrATTT 



TCATGAIATC 



GATACTOCTC TIAAGCACAT 
TCTTTAATEA ACATGTATAC 

ATACATATAT ATAAATAGIA 

1020 
» 

AGATCAATCA CGTOGA3TAG 
1080 

AGAGTATOAT TAGTCAATCT 
TACGACTCCA TCAATGAGCT 
ATTGACGTTT ACATAGAATA 



AAAATCGTGC CCTITTCGAA 



CAGCATQCTT TCCAAGCAGG 

1320 
* 

AATGrrmcr attoeagatc 

1380 

* 

AAGCCAAGCT ATACCXACCT 



ATATCTATAT CTOTITAGAA 



GTATOGTATA AGTTCGTAAC 



AACITCTCAG ATEAAGGAGA GGGAAAADGT TCTTAGGGCG CAACAAGAGC 
1560 



CCGCTCTAGA 
AACTTAAACA 

CATATCTGAT 

AATGTATUAC 

AATGGATOGT 
1140 
CCAAAGAAAG 



ACTGQGTOTA 



A3CACAGGAG 



TQCCAOTGT 



tctogcttc 



AOGACZAC — 
1440 



GTCGTCGCT 



AAACIGGTC 



900 

AAACTATCAA 

'lTlATlUGTT 
TITTTAAAGA 
TEATOCAGAA 



GTATGTAXAA 
U00 
AGAATCCEAT 

AAAGCCA33kC 



CAOTAITITP 

CACT3TTCTC 

-CCTACATTT 

TGAAAATIGA 
1500 
GTGAAATTCA 



AATOGSACGA GCAGAACCAT GGCCATAATA TtXLTJ CULX; TCCACCCCCG 
1620 

CAGCAGCATC AAATCCAGCA TCCTTACATG CTOICICATC AGCCATCTCC 

FIG. 12B




TTTICTCAAC ATGGGGTAGT 

• * 

ATATOICTIA TATRXACAAG 

AGTICT3TTA GTGATGCTTA 

• * 

GUTTnOT GT31TATATTT 

I860 

AAAAGWUCA GACITAXMT 

1920 

• * 

AACTCTAAGA AACCTTTGGG 



AACTTCAGGT AC7TTAGAIAA 



ATTTCACTAC AAAACATGTG 



TAAGCTAXAT CATGTGCATA 



1680 

TAAAAATPCG TTCCTCTTAC TTTCAAGTAC 



1740 

AIAGTTAGGT GTXKBJU5XC CACTGACTTA 

1800 

* * * 

GATGTCIAAA TTGTCAAATA CAAGTACTAA 



AAA CUTATIA ATCATCAATC AAATC3GTCGT 



TTTGGGAAAA GTAGATOGAA TOOCTGCTAA 



AGCAGGTCGT TTTTAnCTT GTTCAAATIA 
1980 

ATAAACTATC TTIGATATOG GCCTTIACCA 
2040 

ATATTTTCAG CACCTATCTA GATAATTTTG 

2100 

• # • 

TCAATCTAAA TCTAGAGQGC TGTATCAAGA 



AGAAGA1CAA ATOGCAATOA GGAGGAACGA TCT OGKTCIG TCTCTTGAAC 
2160 

OCSrmCAA CTGCAACCTT GQ CCGTC GCT GCTGA 



FIG. 12C






GGA3CCCTCC 
60 

CAGATTCTTT 
TIQGTCAGOG 




* • • 

GGAAGCCTni GATCAATGGT AGTIT7IGGTT ATTTTAAGAT 



"nCATCACTA 
ITGIAGATIG 

dTTOCTICA. 

AIGGACTATT 
360 

AAGAGTA3TT 

AAAACTOGAT 

CITAGTTAAC 

TOTCATCCGA 

TAACAGATCA 
AATTCATOCA 

T^CAACACTC 

ACGATTCAAA 



* * • 

TCGAAATCCA GTAACATAGT CIGGGAATAT GATTTGCTTG 

120 

* * * 

raCIGCTIC TCOGTTCGTC ATTTCOGATT TEMCTACTT 

180 

* • * 

TGATAATTTC TTCTTTCTTA CGTCGAGATG XglCTOCTTr 

240 



* * * 

AATTICTCAA TSITOCTTTG ATCA3MGAC CATITGATIT 

300 

• # # 

TIGATCGATC CAATTTCTTC GGGAGATAAA TMGGTDUJkA 



ATTTITOGAA AAXACAGGAG AAAAAAATTC TIAAGAAIAA 

* * • 

ATAGTOAOCA TCAATTTIGT TOlTlTlTiA AAAAGAAAAA 

420 

* * * 

TCGATTOGAT GACACATIGA AATTAACATT CAAA2AGCAT 

480 

* * * 

AGATATIGCA TCCACCATAT AATAAAA3AT CATAATTA3G 

540 

* * • 

GGTrTOTTTT GGTCAAAATG TTOPnTAAT CACAATTTAA 

600 

* * * * 

TTIACCAArr TSrnTTTGA TAATTTATDC CAACTTAGEA 

* * * • 

AAAACTIGAA AAATATAGAT GTCTAATATG TTGACGGATA 

» • * 

AAAACAA3AT ACTCAAAAAA AAAAAAAATT GAAAGOGOCA 



720 

* • * 

CA3ATATGCT AAATTTTAAT AATGGACAAA GGAGGAAGTA 

780 



CTOCATATOT ACGAAAAGTO TTGATAATCG AGAGCAGOOG AXAGIGTCGC 

840 

FIG. 13A


CAAGGGCACG AGLTITACAT TCITTPWGTT TOCTCTAAAT GTICTTCTTT 

900 

• * • * * 

GCTACTITIA ATTOCTTIM TIGCTTCCTT CTTMCTCCA OOTUAXAAA 

* * * • * 

TQGGGffMCC ATTTTCTCTC GTATCTTMT CCGATCTTTO GA3CTATGTA 

960 

* # * * • 

OGTACTACAT GAATAAATCG TCTTCAATAA (7ITATEATCA TTTQGTCIGC 

1020 

• • * * * 

TTAAACTGAT CATGGTGIAT TAATCTATAA TACGTAGTTC TCTTAATnA 

1080 

* * • * # 

TTCCCTAGAA TTCCATCAAA GACAAATTTT AGCAAAAAGA AAAGTTCAOT 

1140 

* * * * • 

A3MAATITC CTTAGTAGTA CAAAAAAAAA CSffllGGZA ATTICTATTT 

1200 

• • * * * 

TOGA3MTTC CTTOATIAAC CCAAACTTCA AAATTAAITT TCnCIGCIG 

***** 

TA3CTTTATA TCCAACGTCA AATCTATTCA CTCAACAAAA TACACAGTIG 

1260 

• * * ' * * 
TCAATTCAAG TICAACICTA CCAAGAAACA TCTA1RXCTA CTTCACICTT 

1320 

***** 

CTTACCGCCG AGCAATTAAA ACCXCXATAA CTACTTGGTT ACATTATTAC 

1280 

***** 

A3TITIATTT ACAAAAAATA TATATCAACA ACCAATAATA TAGTTAGAAA 

1440 

***** 

ATOAAAGAAA ATTATTTAAG AAATATCOGC CCTCAATOCA AATCGAATGC 

1500 

***** 

QYCACTXOGG GAAGCTCTGA AGTCICTGGT CTCTGCATAT TTCACTIGTC 

***** 

TMCTAAOCC ATTnCACGT CACTAGACGT CGATAATCAA TTATTGTTAT 

1560 

• * *. * * 

TOTTTTATC aatgttocac ttatigaaaa ttatatacga gaaaacatag 

1620 

***** 

ACTOGACATT AOGCAATSGA AGTCTAATCA GACCAATCAG AACTCGACAA 



FIG. 13B




1680 

• * * • • 

CAC&XGCSU3 AAACCAACTC TGGTTTMTT CCTTCCCTAA TACCAAGTTA 

1740 

• * * * * 

TMNwricrr tcaaaccxsct atticcaaaa tatctcttct tiaaaiaaag 

1800 

• * * * * 

AGTGAAAGAA GCACTCTTTC ACATTACCAT CATIAGAAAA CTTTCCEAAT 

• * * * * 

TAGATCAAGA TCGTCGTTAT CTCICTTOTT TTTTCT1CR T ATAMTTMT 

I860 

• * * * * 

TATIT^AAGA GAAATOGGAA GGOGTAGQGT TGAAl'lUAAG AGGATAGAGA 

1920 

• » * # * 

ACAAGATCAA TAGACAACTC AC ATVL TC G A AAAGAAGAAC TOG1CTTTTC 

1980 

• • * • * 

AAGAAAGCTC AGGAGATCTC TGTICITTGT GATQCC GA PG TTTCCCTTAT 

2040 

• * * * • 

TOICITCTCC CAIAAGGQCA AATTGTTCGA GTAITCCICT GAATCTTOGT 

2100 

***** 

AATIGCTTAA TTOCTICnT TTTTAATGTT A 'XTITIM TC TOOCTrCGIT 

• * * * * 

TOCCC IA ACT A gn U SICITT GTTCTACTIA AGGCATATIT TCIGTGTCTT 

2160 

w * * * * 

CTATCCTATT ATCTGTCnT GCTCAAAATT TGCCACTGAT TTGGTATCTA 
2220 

***** 

nXACTTGGS ATCTACGAAC TGATTGTGTT GGTCATATCA TTAGTTEATT 

2280 

* * * * * 

1TTATCAATA ATTTATTATA TATCAAAGAA AATGAAA3TT TTTAGGACTT 

2340 

***** 

TTAGTGAACC CTACAATACG ATCTACTTAA TTATAGTGGC, ATGGATTTCT 

2400 

***** 

AAGAAATCTT CAGCATCTTC TITAATCTGG AAATCTACAT TTTGCTTCAA 



GXCMGZTXA GTATATTAGG TACAGAAAGA ACGGATCTTT ATOGTCTAGA 
2460 



FIG. 13C


cmcGGTrrr tgctttiagg aaagciaiac TTrrccraA atatcttiaa 



ctigcatxtt 



ACCAATAATC 



TAATAATOGT 



ATCTAAGTAT 



Gncrrrrr 

2760 

caacagtticc 

TQGGGTTAGT 



TOCATAATAG 



T3LACCCTACA 

GTIAATTACG 
3060 
TCNGTATGCA 

AGCTTAAIAT 



2520 

* • * • 

ATGAACACAC ACACACA3AT ATATATATAT ATATEACTAT 

25B0 

* * • « 

TTAATEAACT TIAGAAAGAA ACICTIOITT TTTTCCCATr 

2640 

* * • * 

TEATAGCTAG GTATAGAGAA ACTGGAAATA AGTATOIGAC 

2700 

* * # . 

GGGGAGICIT TGAGCXCXGG GGATTAATGT AAAACAGATC 

* • * * 

TTCTAAACAG TTCCICCGTA CTSATQGTCA AACTTAACTT 

* • * • 

TITTAAACTT TUkTACGGTG CTIGAAXACG TCTTCGGGTG 

2820 

* * ♦ • 

GGCTCAACTG GTTTATmT TTITAAAAAT GGTAGAAATC 

2880 

* * # • 

CTWQCTAGOG TTTAGGCACA AAACIAGAGA TCA3CTTEAT 

2940 

* * • » 

AAAGGAAGAA ACTAATGTTT AATCACATAG ATTAATTAGA 



3000 

* * # * 

TAATCAGATG CTATATGTTA TCACATATTT TCGGI G AATC 

* • • * 

TTTGAAACAA GTGGCCTCTT GTCCXAGCTG' AXAAGATAGT 

* * • * 

ATTATATTCG TGGTIGAATC CAAACTAATT CTAACTCGTA 

3120 

* • * • 



TTOEAGCATG GAGAAGGTAC TAGAA CGCTA CGAGAGGTAT 
3180 

* • * * 

TCTTAOCSOCG AGAGACAGCT GATTCCACCT GACICTCACG TTAATCTATG 



3240 



TPTAATC5IC TCCATCATAT ATTTGTOIAT ATHTGAATC TTOCATCIGT 

3300 

* * • * 

TOAACATAG CATATAACTG ATEATIGGCT TTCATGTTCG AAATEAATTC 

FIG. 13D


• • • • 

TGAAGGCACA GAGGAACTCG TCAATOGAGT ATAGGAGGCX T»AGGOCAAG 
3360 

• * • * . 

A3TCAGCTIT TGGAGAGAAA CCAAAOCIAC ATJU7IACATT TWkATTIATr 

3420 

***** 
GrERGTAGTTA AAIATICAGG AATAACAGAA GAGAGAATGT TCTTAATEAA 

3480 

***** 

CXAAA3CATC AXAGGCATIA TCTOGGAGAA GAG1TGGAAC CAATGAGCCT 

3540 

***** 

CAAGGATCIC CAAAATCIOG AGCAGCAGCT TCAGACIGCT CTTAAGCACA 

3600 

* * * * * 

TICGCTOCAG AAAAGTCTOT AAATAT&XCC CACACTC1M CTCTATOCAT 

***** 
AACTAACrrr GACTTTCICT GGATGTATTA CAXAHSUnCA AATATTOTAT 

3660 

* • # * * 

AGAGATXGTC TCATATAAAT AAATAATTTT TOUXT1T1 T GTMGCAGAA 

3720 

***** 

TCAACICATG AAXGAGICOC TCAADCACCT CGAAAGAAAG CTAGC33VACT 

3780 

***** 

TAAAACCATT TTATCTOTCA ACXCCIGTGT G13VIAGAGIC ATGACTXATA 

3840 

***** 

TOTIAGAGAT TAATAAA13kA ATAACATAEA GC3TTATAIAT 

3900 

* • * * * 

AATTCAGCTT AAXA3MTAT TAATTACTAG ATCXAXAZAT ACTIAIA1»G 

***** 

AICAXA3aAA AAGAGAAATT GACAA3GGTO TCATHTIGT GGAAAIGACA 

3960 

***** 

OGAGAAGGAG A3ACAGGAGG AAAACAGCAT GCTnkttAAAXAGGTCAJTCA 

4020 

* * * * * 

TTCiTrrnG catticiaac tcitocacta ttxacaattc cacigttcaa 

4080 

* * * * * 

C3CCACITCA ATCrCTACCT TAACGTACCA TCICTCCACT TICOGCCOCA 

FIG. 13E




ACICTTTGA 



Gncicicoc 

4260 

GTCTATGTGT 

TTAAGTTTIA 



AAATTACACA TACATT3U3CT AGATAGCTAC GAOCXAXAUT 
4320 

* » • 

TTCAGAAGTA AGAAAACG7IA CGATGAAACT ACTKSATTAA 

4380 

* * • 

GMICMMM* TAAATGAAAA AA1ATCACAA TAGTAMACC TTCACGACGC 

4440 

* * • * 
1SAAATTCGC TXMCATTIT GCAGATTIAA TDOTMTTT OOOTITGTr 

4500 

* * ♦ 

TCAAAAIMC ATATEACAAA AAAAACTAIA AGAATAAAAA ATTGAMTIC 



4560 

GMGATCCTT 

ASAGGGAAAA 



TCACATOATC 



ACTATITCTT 



GBlAAAftGAA TTCATAOTTA GOTCITTTG ATTGGXATAA 

4200 

* * * * 

AOCIGCACGT ATMGTMGC TTTCTCOGTT T MTOTO M 

* • * • 

AGATITCAAC TTCAACTTCA ACTGflCTICT CATAATCATA 



♦ * • 

TSCAAA33VGC TGATTAGTTC CAAA3GGG&A TCTATATAAC 



ATRTCATITT CTIGGOGTOT GTC31ATCGGTA TRGA3AAAGG 



4620 

* * * 

CATCCTAAAG ACAAAACAAA CCCAATGTCA GCAGCTGAAC 

4680 

* • # * 

AOGATGTACC ACAGCCACAA CCATTTCAAC ACCCCCATCT 

4740 

* • • 

GCTCMCAGA CTTCICCTTT CCTAAAIATC GGGTAAOQGC 

4800 

* * • • 

ATlTriTIAA GTrCTTTITT CTOOCAiaUl TOTCAAATTC 

* * » * 

TCAAGTGTT3 TCAGTCAGTC ATATAGGCAA TGATRGTGAA 



TCAEATAEAG 
4860 

TKACnCAT ATATW3C3CTT TCTCTEADGT A TOGOSHA G AD G TT GA TO G 
4920 

TATOCATGCA TATTATTCTDl TTOIGATCTT TAATTTGCTA TRIATCATIG 

FIG. 13F

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35 /U 
4980 

• • * # 

TAATTTCAGT GCTXTCttOC AAGGAGAAGA CCAAACGGCG ATGAGGAGGA 

50*0 



ACAA3CTGGA TCTGACTCTT GAACCCATTT ACAATEACCT 
* * * 

gccgctigaa tiubacxacat cgatciaiat Ok McicrrT 



AAGATCGATC CTCIATICAT GATCTATATT AAACACCGGT 
5160 

* * • 

mmUiTA TGXCCTXAXA TCAIATCAAC AICATCAAGC 

5220 

* # * 

TTCAAIATAX CTTGTATTTC GGGGAGGAAX GAATAAATGT 

52B0 



GACIGAGAGA GCTAGAAAGA ATTGTTCTTC AAACCTTTTC 

5340 

* • * * 

TCAICGTEAC ATTCTAATTT GATTTCTTTC ACACCCCAAA 



• * # 

TfcCGAATTTA GTCTTIGATC AHTCAACTT TOCTIGGXCA 

• * • • 

GAGCCTXAGA AGCTAAATTT TGAATTCAAA ATAGAAATAA 

5460 

• * • 

AACCTCACAT TCGGTTTCTT CTOCATTTGC TTCRTGTA GG 

5520 

• * • 

GAXCGGAAAT GAGAATTATT GGGCCCTTGT GGQCTICAIA 

S5B0 

• • * 

CKTIGTmA GOCCATAATA CTIGGCATTT TIGCCAAAGA 

5640 



TAAAAGAAAT COGAGAAGAA AAGAAAAATA GTEAGTCGCQG CAATOGAGGA 

5700 

• * * 

TCXATGGAAG AGGGCAAAAT CGTTCGCAGA AGAAGCGGC 



IGGCTGTEAC 
5100 
AAAATAATAT 

TAATTAATAT 



CTTTITDO A 



AATATTTGTG 



TA33VTTGATC 

ATAITIIjTAA 
5400 
AAGTAAATCA 



AAATGTTGGG 



TOQCTGAXAC 



ATTATTAGIT 



AGAAACXCTA 



AAGAAGTL'IV 



* * * 

AGADGAIAAC ACAATCATOC T CC GC GA CCT TOGTCAATCT CCTCACCGAG 



5760 



FIG. 13G

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TCIRGA331AX 
60 

A3ATCICQ G C 



CAACCAATTA 



AAACACTTTA 



CIAIATGTGA 



AGAGATOQCT 



360 

1GCACACAGA 



A C C QG A CICU 



TOXCMGGGS 



GGCICTGATC 

GGGA3GGAGC 
660 



GGATIGGOGA 




CTTCTCAAGA AGGATITAGA ATQGCATAAT 



ATCTGAAACC AT3OTMCAA TTTATTCATG 
120 

AAAATAATCA GTGCATATGA TITCATAAGT 
180 

* » * 

U'XALUXJUATC ATOGTGCGAA ACAAGTOGAG 

240 



TCCTTAQGCC ACACGGCATC TAA1CTGATA 

* * * 

TCTGAGATAT GCAAGCAAGG TCACACGACC 



TAGGCCACAC GQCAAGCIAT GATGCATTAA 



A3GA3GCAAC AATGTGATCT A3CAAGQG 



420 

CTSGAC GCGAG CTGGCICTOG TOGGRTOCGA 



480 
* 



TCIGCTICCT A3GGGGCTCG e GAGClUCXT 

540 

CTOATOGGGA TTACAAGCIG GTTGATCAGG 

* • 

OGAACQGAAC CTCAGGTTCT CTAGGATCAG 



TCMCGCTTO CTCACGRGCT GGAAGGOGA 



* • 

TOGGGATTAG GTTAAAGTCG COGGCTAGGT 

720 



TTTTAGCTTA GATIGCAGAG AACAA TCGTG 
780 

* * 

GTTGTMTTA GAAGATTGAA GATTGAATAG TTCTGTIGTTT 



OC3UUU3GCZC 



A^raAGGATG 



AATGCTAGL7T 

CAADGATCCT 
300 

ATTCATAIAT 



GOCACACGGC 



—- CICGftGC 



GCTGAAOGGG 



CCn^TCGGGT 

AACACGAGCT 
600 

GAACACCTXA 
CTAGGAGGAA 



TAGGTTTAAG 



CTGATAACAT 



TATTAACATA 



840 



FIG. 14A
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WO 97/27287 PCT/US96/01041 






ACA3GAMT AAAGAT TOCRCGRCTT TGdACUGT TCI KHOCP l 

900 

* • * 

GTTAGGTTAA GOGACTTAAG 



♦ • * 

CAAAGTAGftG TSMTCGCAT TAACTCTTCA 



* * * 

jysngysMusr grgtdcmixc tsacaagcig 



GJUCTTTGGAX CRGMGQIC CAUMBHDtfj 



GBUSIGCCCA CGAAGACTCT 
960 

TTAGAGCTIC ACTAACACTT 
1020 

* * • * * 

TMAAOfflAG AO COA TRTA ATRCAAAACT ADUJMTGA CMAAATTT 

1080 

* * * • • 

GMTO1CTAC ACCAACTCCT TTWUSOkAGA CAGGTCCCGA GACCQGAGTG 



GinCZTOT TCAGCTC- 



FIG. 14B
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• • • • • 

AAOdTTAGG GXTTTAGQGX TTTTCATTCC AAGATRAGS GTTTICATAA 
60 

• * • * * 

TTCAGAXCAG AACAATCAAT CAACATGTIC TAATOGAATC GATTXCAATC 

120 

• • w * ft 

TAGTCATTAT AAGATCATCA GTTTIMGTT ATOCCAATTT TTAOGATTXA 

1B0 

• • • • • 

TCAAGATCAX TGGATTTCCA TAATAATGGA TOGGGTTTT AGGCTTTG&T 

240 

• * • * » 

CATMCTIT TIAGATTAAT OQGTATACTT TXGCTXGXBG GSITSAAAQC 

300 

• * • • • 

QGACCAGCAA AGAGAAGGGA TGAACCICGA GCTCCACACC GACAGATOOG 

• » * * • 

AOCXGQCICT OSlCOCa aPC CMCIGAACG GGAOZXaCG LU ' lL ' lULTXi: 

360 

• * • • • 

CZ&TCQQGrr CGCGA GCTOC TTOCTATOCG GTTTQCAAGC GGCTGATOOG 

420 

• • • • * 

GATTGOGAGC TGGTTGATCG GGAACACGAG CTGQCTOICA TCOGAACOGA 

480 

• • * * • 

AGCXGAOGIC GTCEAGGATC AGGAACACCT TAGGGATOGA GCK2A3GGST 

540 

• • « * « 

TGCXGAGGAG CXGGAACGCG AGCTAGGACA AATTAUA.Tr OGICOQG&CT 

600 

• • • • • 

AflGTTAAAGT CGOCGGCTAG GTTAGGTTTA AGGGATIOGC GATTTTAQCT 

ft • • • • 

TAGATTGCAG AGAACAATCC TCCTGATAAC GTGTTCTAAA ACAAACOCTT 

660 

• • • • • 

TTAGAAACTG AA3GTITA3G TCTATTATIA. ATCMTkATAT GGGTXTTTT* 

720 

• * * * • 

— — T ACAGTGCGAG AATGATAGAC TOQCAXAGCC AATGAACTCC 



780 

* * * ft * 

AGTCMSACCA. AOTAGAAGTC GACAGCAAAA CCIAGTAAAC TACTCTTC3TT 

840 
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39 fkk 

TBLTCCITCT CCAAAAGCA3 CTmfiGXTT CCCXGAAACC GCTOOTCCA 



AAACMCTIC TCCTTAAA'Bl AAGAAAGACT 



TCAGAAGGGA AAGAAGAAAA ACTITCCTAA 
960 

TCICTCXATT AXAGTTTATA TTTCTIMro 
1020 

CTrrrroGAc ncrrrnoA taacttatat 

1080 

* * * 

GQG1MGGTT GAAATGAAGA GGAIAGAGAA 



OCTTXTCGAA AAGWUSAGCT Gfa ' lUTlTllA 



ARCTRGIG ATGCTCAGGT 



ACICTTOGAS TRCXCGICIG AATCTTOGTA 
1260 



AMGTTTTAG TC1UXT1TC TTOQOCCIAA 
1320 

TEAGGCCATT TCTIGGTATC ITCTTftTOIT 

1380 

TTOCTRGTT AATTACITOG ATCTACG&AT 

* * • 

taaaocaita tjuxatatit gctbumca 

• * • 

QGCAIAATAA GGTCTTATCT GAAGIGAAAG 



MTMGMM GCTTAACCCT AGATCAAGAT CEACTICIAC TGGTCGCGAC 



1560 
* 



900 

• * 

CTXTCACATT GTTATBOCA 



TTAGATCGAC CTTGICGTBL 



GGGCnXSTOT GGTIGCTTC 
* 

ATTCTACGAG AAATOGGAAS 

* * 

CAAGATCAAC AGACAAGTOA 

1140 

AGAAAGCCCA TGAGATCTCG 
1200 



GICTICIOOC ATAAGOGCAA 



ACIGCAXAAT TCOTTnTA 



TAAADkGTTT TTGXTCICC 

TTXAIGAAAA TICTCACAAA 

TGATTTCACC AAACIGAAAT 
1440 

GAAGAAAAIA AAAAAAAXAG 



1500 

• * 

TTTACTTCAG GTAACACGTT 



ATOGATmC AAGAAATOCT CACICTATAT GAACTTEAAT TIAAACATGT 
1620 

* • 

AaaGMcnr ttctttcaaa tagagactita AGraATTTAA tcatagaaag 
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MU ' 1680 

» » • 

AACCAACGTT ATGTTCATCT AGGCTAGACT GATTTCTCCC 

1740 

» * * * 

GAAAAGCIGT CCTTATGCTT AAATATCTTT CAGCAGCA30L 



AGAAAATATT TCAATATCGT TCTATAAAGG TTCXATAATT 

• • * 

TtTrrrroGC aaatogtita taiagagaaa ctagaactag 

1860

• • * 

TCEAGGEATA GC5DGTCTTIG AOCICIG GG A TCAATCTAAA 

1920 

• « • 

CTA2TTICEA TCAACTTCTC AGTOTCOGAT GCTCAAAACT 

1980 

• • • 

AArorraT cititcagaa gaggacaaac tattatatgt 

2040 

• * * 

ATOICGTTTC ATACATAAAT ATCTAATAAC AAATTTATTT 

• * • 

ATAACAAAAC TTTATTGAAG AATIGGAAAC TCAAAACGGG 

• * • • 

ACGCTOCACG TCIAGAGGTG TGQGGTTAGT GATTCAAOGG 

2160 

• * * 
TKAGAAACT GTAGATGTAA GATTulTrCT AGGGTTAAGG 

2220 

• • • 

OQGATIATCT LT1T1TJLATG ATAAAAGTIA ATGTCTTAAA 

2280 



TAACAA3TTT 

GTAGTATGAA 
1800 

rrcsnrm 



ATEAATIAGG CAAACTAGAT GATAGZACGT AGTGTICTGTG 

2340 

» • # 

TTOGATATTT TOGGTTAATA GTTACATCTT AGACAAATGT 



• » • 

GAIAAGCTGA GAAAATATTT GGGTOCAGAC TCTTAGTX3GT 



A3CTAGAAAN NCOCANATAC NAATTTAATA CGGCTOTT TIGGGIGAAT 



2460 



GGATOIGACA 
* 

AGAGACCATT 
* 

TAACTTCAAC 

ATATTATCTT 

TTAAAAACAT 
2100 
GACATATAGG 

GTTTTIAATG 
* 

CACTAAACCA 



TGCA3CGCEA 



TGic r roroiA 
gpsgrcriCT 

2400 
AATTAATEAT 
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41 /LU 

GWXCXRQIC CCrAMGMA GCATQGAGAA GGZACT2U5&A. 



2520 



QQCEAGGAGA, G3XAC7CRA. 



1CACCTGRAT cTiutsmsuw 



CCOOGMMA CAGCXSUtMG CTOCAGACTC 



2760 

2820 

TXTTTTGODl AAATTRTICT 
QSGGfflGSXSl TOG&TIAATC 
CAKXCAGCAX AAAGGAGCXA 



cnmcm tidociccag 



AMC&XCXCT ATCITATCTT 
3060 

TOThTflflMT TXBICXCKX31 
3120 



GHMUUyDCR TTTCA3CTCT 



2580

TGMCTCOU l GACXCXC3XG& AAPRTRPkTC 
2640 

TTCttMTTfk ACMWOTGA TQCACXBRT 

2700 



* * 

TTQ MC ITIG GGAGJU3GMC OUkMBPCT 



AOTICTAAAT AASBUSTZTRX TQZATDUQTT 



An&GKAM, CACIGQGMT T3UVT3UUUUVA 
2880 

XnyBQOttTH iwiramSA GAITIAGAAT 
2940

CAGWtTCTGG AGCAGCAGCT TCACACTTCT 

3000 

• • * 

AAAAGTCTCT AAAXAAGCAC AXACAMCGC 



TORCTrSSIC ammw&xa: tgccxmttt 



TGAATCAAXA CMXTKMC 



ASI OCC 1 CM I On O C C OCM l AGAAAGGTRC 
3180 

CAACTCGTAC GTCTUTRTCT CjTCAL'ITATG 
3240 

rauxGrrav aatctttcjjs naAMnucAA aacmmggt tttrcacatc 



3300 

• • # 

TtWaciMT TXGCXGMGG AMUCASRSSX3L AATOttAACA. AAGOQGTnT 
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• * * * * 
TTOGATTGAA TAAAATTTAA CATTCATTCA AAAAAAACAT ATOGTTCAIA 

3360 

• * * * * 

UXKONRGG GTTTATRTGA TTA331TAXAT ATATTTM3VT AGGTCTAATAT 

3420 

***** 

ATnUSTGTTT AAITATA3GT CHMACAXAT AGATCjIAGAA AGAACCTCIA 

3480 

***** 
GAGOGATCCC TCftGAMTCT TICATTTOGT AAAATIGACA GGAGAAAGAA 

3540 

***** 

ATACTOGAGG AAAACAGCAT GCTIGGCAAA CAGGTAATCA TTOngCTIC 

3600 

• * * * * 

cmiTTTrac TGrracacAA cxGrrmcr attiaaactc cacicticia 

***** 
CTCCACTICA ACCKAAACT ACCATTGCTC AACTTTOOGC ADCAACTCTT 

3660 

***** 

TTCTAAAAAG GAftGAATIAG TPGTITCATG TCATTQCTAT AATCATCAGC 

3720 

***** 

AIATGTGCAC ACA3CTAQGT GGQCXTTOXC CGflTOOIRT TAAGCTTGTC 

3780 

***** 

TCCTAGAATT GAACTXGAAC TCsTCTXCTOQ TAATCAIAGT CTATATATAA 

3840 

***** 

CRCOCTGCAC AIACACTU3C CAGTAGGTTT ATTTGAOCAA GATAC — 

3900 

***** 

— TOdCTT ACIGTAAXAC CCTGCCAACA TIGATIGICA TTOGAXACAT 

***** 

AAATTCAGTT GATCATAACG TTTATOQGTA TTTCAAATTC GTAGATAAAG 

3960 

***** 

G&GAGOGAGA GTATCCTAAG GACACATCAA AACCAATCAG AGCAGCAAAA 

4020 

***** 

GCGCAGGCAC CATCTAQCTC CTCJU3CCGCA ACOGCAGTTA AATCCTTACA 

4080 

***** 

TOOCATCATC TCCTTTOCTA AATATGGGGT AACQCX&GXG TTICKITCTr 



4140 
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A3CTTOGTRT AfATATATRC 



TTOU3XCTAT GQGASGKKTT 

* * 
CTGTTAAGTG TGGGGTT2US& 

4260 

ACBIBUCAAX AATIAAEAAG 

4320 
« * 

* » 

auy o oBo osr gaqcmgaac 



MCXGGMOC TIIAjITACTT 



CRttAAAIAA TmXAXMG 



A3T3GTTAGCC A0CR3MCIA 
4560 

AC&3TCACTT A3ACI3UCA3A 
4620 

MCTGA33USA 1TTOCX3USAC 



TTTQCnTQG CCTTTTCTKr 




A3MAGATCC 

TCTATCTOIG 
GGntSASGGC 

A3GGAATCAT 

4380 

0QZCXGBB2C 



TGCCGCATOA 
TA33UQUCXGG 



AACCCTCCAG 



ATGCIACACA 
4680 



raaaarar tewuuottt TCMAcrmr totohqct cacagtcaac 

4800 

• * • # * 

AAA3CTTCXG TCAAGAAGTC GmmiTC TOIQGR QCCA CTICG0CM T 



GACACXCTXG GTOrSUSQUk 
4200 

* * 

* * 

TTITjTAACTA CATOICnUSA 



CATOnOCIUl GGAGAAIAIC 



TCRCXC3TGA A C LLA TITAC 
4440 

A1X3GACT0GC CATATOTOQA 
4500 

ACGTMAATA AZAQGCAGCA 



AAATTCTATT TATC — — TT 



ACCAAACICG TCTCCATOOC 



ctccatgact cogactaatt 



gttttbota ArrornrcA atrtacxci 

4740 



Ul ' lL ' mUfl ' GGKICC 
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